
Human-computer interaction method and device

Info

Publication number
WO2022228056A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion sensing
sensing data
initial
processor
data
Prior art date
Application number
PCT/CN2022/085282
Other languages
French (fr)
Chinese (zh)
Inventor
解文博
赵安
陈维
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2022228056A1 publication Critical patent/WO2022228056A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017 - Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 - Movements or behaviour, e.g. gesture recognition

Definitions

  • the present application relates to the field of terminal applications, and in particular, to a human-computer interaction method and related equipment.
  • Embodiments of the present application provide a human-computer interaction method and related equipment for controlling a cursor in a display device through asynchronous calibration, which can improve the continuity of cursor movement in the display device, thereby improving user experience.
  • a first aspect of the embodiments of the present application provides a human-computer interaction method, applicable to a human-computer interaction system including a motion sensor, a camera, a processor, and a display screen. The method includes: the motion sensor acquires initial motion sensing data at a first sampling frequency within a first time period, the initial motion sensing data being triggered by the user's limb movements; the camera acquires first image data at a second sampling frequency within the first time period, the second sampling frequency being less than the first sampling frequency, and the first image data including the user's limb movement information; after that, the processor obtains a first constraint condition, the first constraint condition being obtained by performing computer vision (CV) processing on the first image data; the processor calibrates the initial motion sensing data according to the first constraint condition to obtain target motion sensing data; further, the processor obtains control information according to the target motion sensing data, and the control information is used to control the display screen.
  • in this embodiment, the camera obtains the first image data at the second sampling frequency in the first time period, the first constraint condition is obtained through CV key point recognition, and the processor calibrates, based on the first constraint condition, the initial motion sensing data obtained at the first sampling frequency within the same time period to obtain the target motion sensing data; thereafter, the processor further obtains control information for controlling the display screen according to the target motion sensing data.
  • the second sampling frequency is lower than the first sampling frequency, that is, the processor performs asynchronous calibration on the initial motion sensing data to obtain target motion sensing data.
  • since the calculation time of the CV recognition process is generally much longer than the processing time of the IMU data, the asynchronous calibration implementation does not need to wait for the long CV processing; this effectively avoids problems such as display freezes and display delays, so controlling the cursor in the display device through asynchronous calibration can improve the continuity of cursor movement in the display device, thereby improving user experience (a rough sketch of this two-rate structure follows below).
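  • To make the two-rate structure above concrete, here is a minimal Python sketch, an illustration only and not from the application; sample_imu, calibrate, solve_attitude, and emit_control are hypothetical placeholders. It runs the IMU loop at the higher sampling frequency and applies the most recent CV-derived constraint whenever one becomes available, without ever blocking on the slower CV processing:

```python
import time

IMU_RATE_HZ = 100   # first sampling frequency (illustrative value)
CV_RATE_HZ = 5      # second sampling frequency, lower than the IMU rate (context only)

latest_constraint = None  # most recent first constraint produced by CV processing

def on_cv_result(constraint):
    """Callback invoked whenever the slow CV pipeline finishes a frame (~5 Hz)."""
    global latest_constraint
    latest_constraint = constraint

def imu_loop(sample_imu, calibrate, solve_attitude, emit_control):
    """High-rate loop: never waits for CV; uses whichever constraint is newest."""
    period = 1.0 / IMU_RATE_HZ
    while True:
        raw = sample_imu()                           # initial motion sensing data
        if latest_constraint is not None:
            raw = calibrate(raw, latest_constraint)  # asynchronous calibration
        target = solve_attitude(raw)                 # attitude calculation step
        emit_control(target)                         # drives the cursor display
        time.sleep(period)
```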
  • the processor and the motion sensor may be installed in the same electronic device, or the processor and the camera may be installed in the same electronic device, or the processor and the display screen may be installed in the same electronic device, which is not limited here.
  • the first constraint condition is obtained by performing human skeleton key point recognition in the computer vision (CV) processing on the first image data, and the first constraint condition includes three-dimensional spatial orientation angle information.
  • the CV processing may be implemented based on a three-dimensional human skeleton recognition technology, or may be implemented based on a two-dimensional human skeleton recognition technology, which is not limited here.
  • the first constraint condition for asynchronous calibration of the initial motion sensing data may be three-dimensional space orientation angle information obtained by CV identification processing.
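  • To make the notion of a CV-derived three-dimensional orientation-angle constraint concrete, one plausible way to turn two skeleton key points (say, elbow and wrist) into yaw/pitch angles is sketched below; the key-point choice and camera-frame convention are assumptions for illustration, not fixed by the application:

```python
import math

def orientation_angles(elbow, wrist):
    """Derive yaw/pitch of the forearm from two 3D skeleton key points.

    elbow, wrist: (x, y, z) tuples in the camera coordinate frame
    (a convention assumed here; the application does not fix one).
    """
    dx = wrist[0] - elbow[0]
    dy = wrist[1] - elbow[1]
    dz = wrist[2] - elbow[2]
    yaw = math.degrees(math.atan2(dx, dz))                    # left/right angle
    pitch = math.degrees(math.atan2(dy, math.hypot(dx, dz)))  # up/down angle
    return yaw, pitch

# Example: forearm pointing slightly up and to the right of the camera axis
print(orientation_angles((0.0, 0.0, 2.0), (0.2, 0.1, 2.3)))
```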
  • the processor calibrates the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data, which specifically includes: the processor first obtains calibration data by mapping according to the first constraint condition, and fits a first curve based on the calibration data; then, the processor fits a second curve according to the initial motion sensing data, and performs weighted average processing on the first curve and the second curve to obtain a third curve; the processor determines the calibrated motion sensing data in the third curve; after that, the processor processes the calibrated motion sensing data according to an attitude calculation algorithm to obtain the target motion sensing data.
  • that is, the target motion sensing data is data obtained after processing by the attitude calculation algorithm: the initial motion sensing data is first calibrated, and the calibrated data is then processed by the attitude calculation algorithm to obtain the target motion sensing data (see the numpy sketch below).
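  • A minimal numpy sketch of this calibrate-then-solve variant is given below, assuming both signals are polynomial-fitted over a common timebase; the blend weight and polynomial degree are illustrative assumptions, not values from the application:

```python
import numpy as np

def blend_curves(t_cv, calib_data, t_imu, imu_data, w_cv=0.6, deg=3):
    """Fit a curve to each signal, then weighted-average them on the IMU timebase.

    t_cv/calib_data: sparse CV-derived calibration samples (first curve)
    t_imu/imu_data:  dense raw IMU samples (second curve)
    Returns the blended third curve evaluated at the IMU timestamps.
    """
    first = np.polyval(np.polyfit(t_cv, calib_data, deg), t_imu)
    second = np.polyval(np.polyfit(t_imu, imu_data, deg), t_imu)
    return w_cv * first + (1.0 - w_cv) * second   # third curve

# Example: 5 CV samples per second versus 100 IMU samples per second
t_imu = np.linspace(0.0, 1.0, 100)
t_cv = np.linspace(0.0, 1.0, 5)
third = blend_curves(t_cv, np.sin(t_cv), t_imu, np.sin(t_imu) + 0.05)
```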
  • alternatively, the processor calibrates the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data as follows: the processor first processes the initial motion sensing data according to an attitude calculation algorithm to obtain first attitude angle data; then, the processor fits a fourth curve according to the first attitude angle data and fits a fifth curve according to the first constraint condition; after that, the processor performs weighted average processing on the fourth curve and the fifth curve to obtain a sixth curve; further, the processor determines the target motion sensing data in the sixth curve.
  • that is, in this implementation, the initial motion sensing data is first processed by the attitude calculation algorithm, and after the processing result is obtained, the processing result is calibrated based on the first constraint condition to obtain the target motion sensing data.
  • the control information is coordinate data obtained by performing coordinate transformation on the target motion sensing data, the coordinate data being used to control the display position of the cursor on the display screen; or, the control information is a gesture identification result obtained by mapping the target motion sensing data, the gesture identification result being used to operate an interface element of the display screen.
  • that is, the control information for controlling the display screen, obtained by processing the target motion sensing data that results from asynchronous calibration, can perform various operations on the display screen, such as controlling the display position of the cursor on the display screen, or operating interface elements on the display screen through selection, zooming, dragging, clicking, and the like.
  • before the processor calibrates the initial motion sensing data according to the first constraint condition, the method further includes: the processor aligns the first constraint condition and the initial motion sensing data according to a time difference.
  • the first constraint condition and the initial motion sensing data may be aligned through the determined time difference, so as to eliminate the effect of the time difference.
  • the time difference is calculated through an initialization process before the first time period, and the method further includes: the display screen displays first prompt information for prompting the user to make a specified limb movement; thereafter, the motion sensor acquires motion sensing data in the initialization process, the motion sensing data in the initialization process being triggered by the specified limb movement made by the user; the camera acquires image data in the initialization process, the image data in the initialization process including the specified limb movement information made by the user; further, the processor determines the time difference based on the signal characteristics of the motion sensing data in the initialization process and the signal characteristics of the image data in the initialization process.
  • that is, the display screen prompts the user to perform a specified limb action; while the user performs the action, the motion sensing data in the initialization process is acquired through the motion sensor, the image data in the initialization process is acquired through the camera, and the processor further determines the time difference based on the motion sensing data and the image data.
  • the method further includes: the processor determining initial relative information between the user and the camera according to the image data in the initialization process.
  • the initial relative information may include distance, orientation, and the like.
  • the processor calibrating the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data includes: the processor determines an initial ergonomic arm model according to the initial relative information, the initial ergonomic arm model including at least a first value range of limb movement angles; then, the processor updates the first constraint condition according to the initial ergonomic arm model to obtain an updated first constraint condition; after that, the processor calibrates the initial motion sensing data according to the updated first constraint condition to obtain the target motion sensing data.
  • that is, the ergonomic arm model constructed from the relative information between the user and the camera can be used to update the first constraint condition; in other words, the first constraint condition is further constrained based on the ergonomic arm model, so as to avoid problems such as inaccurate cursor positioning and cursor overflow (a clamping sketch follows below).
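  • The application describes the ergonomic arm model only as bounding the feasible range of limb movement angles. A minimal sketch of using such a bound to constrain (clamp) the CV-derived angles might look as follows, with the range values purely illustrative:

```python
def constrain_with_arm_model(yaw, pitch,
                             yaw_range=(-60.0, 60.0), pitch_range=(-40.0, 50.0)):
    """Clamp CV-derived orientation angles (degrees) to the feasible range given
    by the arm model, so implausible angles cannot push the cursor off screen.
    The range values are illustrative assumptions, not from the application."""
    clamp = lambda v, lo, hi: max(lo, min(hi, v))
    return clamp(yaw, *yaw_range), clamp(pitch, *pitch_range)

# Example: an implausible CV estimate gets pulled back into the feasible range
print(constrain_with_arm_model(85.0, -10.0))  # -> (60.0, -10.0)
```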
  • the process of obtaining the updated first constraint condition may specifically include: the processor determines first relative information between the user and the camera according to the first image data; then, when the first relative information is different from the initial relative information, the processor updates the initial ergonomic arm model according to the first relative information to obtain a first ergonomic arm model; after that, the processor updates the first constraint condition according to the first ergonomic arm model to obtain the updated first constraint condition.
  • that is, the ergonomic arm model constructed based on the relative information between the user and the camera can also be updated when the relative information changes, and the updated ergonomic arm model is used to further constrain the first constraint condition, so as to ensure the timeliness of the control information for controlling the display screen.
  • the processor obtaining the control information according to the target motion sensing data includes: first, the processor determines the user's initial mapping relationship in the display screen according to the initial relative information; then, the processor performs coordinate transformation processing on the target motion sensing data according to the initial mapping relationship to obtain the control information.
  • that is, the user's initial mapping relationship in the display screen can be further determined according to the initial relative information determined in the initialization process, and the initial mapping relationship can be used as the processing basis of the control information, so as to avoid problems such as inaccurate positioning and cursor overflow arising when the relative information changes.
  • the processor performing coordinate transformation processing on the target motion sensing data according to the initial mapping relationship to obtain the control information includes: the processor determines first relative information between the user and the camera according to the first image data; then, when the first relative information is different from the initial relative information, the processor updates the initial mapping relationship according to the first relative information to obtain a first mapping relationship; after that, the processor performs coordinate transformation processing on the target motion sensing data according to the first mapping relationship to obtain the control information.
  • that is, the initial mapping relationship determined based on the relative information between the user and the camera can also be updated when the relative information changes, and the updated mapping relationship is used to perform coordinate transformation processing on the target motion sensing data to obtain the control information, so as to ensure the timeliness of the control information for controlling the display screen (a mapping sketch follows below).
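  • As a rough picture of the coordinate transformation and mapping update described above, the sketch below maps orientation angles linearly onto screen pixels and clamps the result to the screen; all numeric choices are assumptions, and a mapping update after the user's relative distance changes could simply rescale the angular spans:

```python
def angles_to_cursor(yaw, pitch, screen_w=1920, screen_h=1080,
                     yaw_span=120.0, pitch_span=90.0):
    """Map orientation angles (degrees) to screen coordinates.

    yaw_span/pitch_span define the angular window mapped onto the full screen;
    updating the mapping when the user moves closer or farther could shrink or
    widen these spans. All constants here are illustrative assumptions."""
    x = (yaw / yaw_span + 0.5) * screen_w
    y = (0.5 - pitch / pitch_span) * screen_h
    # keep the cursor on screen (avoids "cursor overflow")
    return min(max(x, 0), screen_w - 1), min(max(y, 0), screen_h - 1)

# Example: arm pointing 12 degrees right and 5 degrees up of center
print(angles_to_cursor(12.0, 5.0))
```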
  • the motion sensor includes a sensing unit of one or more sensors among an accelerometer, a gyroscope, and a magnetometer.
  • the motion sensor may be an IMU data acquisition device, wherein the IMU data acquisition device may include a sensing unit of one or more sensors among an accelerometer, a gyroscope, and a magnetometer.
  • the camera includes one or more cameras selected from a depth camera and a non-depth camera.
  • the camera may include various implementations, such as a depth camera, a non-depth camera, etc., so that the solution is suitable for different application scenarios.
  • a second aspect of the embodiments of the present application provides a first electronic device, including a motion sensor and a processor. The motion sensor is used to acquire initial motion sensing data at a first sampling frequency in a first time period, the initial motion sensing data being triggered by the user's limb movements; the processor is configured to calibrate the initial motion sensing data according to an acquired first constraint condition to obtain target motion sensing data, where the first constraint condition is obtained by performing computer vision (CV) processing on first image data acquired by a camera at a second sampling frequency in the first time period, the second sampling frequency is less than the first sampling frequency, and the first image data includes the user's limb movement information; further, the processor is further configured to obtain control information according to the target motion sensing data, where the control information is used to control the display content of a display screen; the camera and the display screen are contained in a second electronic device different from the first electronic device.
  • in this embodiment, the first electronic device obtains, through the motion sensor, the initial motion sensing data at the first sampling frequency in the first time period, and also obtains the first constraint condition derived from the CV processing of the first image data at the second sampling frequency in the same time period; after that, the processor in the first electronic device calibrates the initial motion sensing data based on the first constraint condition to obtain the target motion sensing data, and further obtains control information for controlling the display screen according to the target motion sensing data.
  • the second sampling frequency is lower than the first sampling frequency, that is, the processor performs asynchronous calibration on the initial motion sensing data to obtain target motion sensing data.
  • since the calculation time of the CV recognition process is generally much longer than the processing time of the IMU data, the asynchronous calibration implementation does not need to wait for the long CV processing; this effectively avoids problems such as display freezes and display delays, so controlling the cursor in the display device through asynchronous calibration can improve the continuity of cursor movement in the display device, thereby improving user experience.
  • the processor is specifically configured to: obtain calibration data by mapping according to the first constraint condition, and fit a first curve based on the calibration data; then, fit a second curve according to the initial motion sensing data, and perform weighted average processing on the first curve and the second curve to obtain a third curve; and determine the calibrated motion sensing data in the third curve, further processing the calibrated motion sensing data according to an attitude calculation algorithm to obtain the target motion sensing data.
  • that is, the target motion sensing data is data obtained after processing by the attitude calculation algorithm: the initial motion sensing data is first calibrated, and the calibrated data is then processed by the attitude calculation algorithm to obtain the target motion sensing data.
  • the processor is specifically configured to: process the initial motion sensing data according to an attitude calculation algorithm to obtain first attitude angle data; then, fit a fourth curve according to the first attitude angle data and fit a fifth curve according to the first constraint condition, and perform weighted average processing on the fourth curve and the fifth curve to obtain a sixth curve; and determine the target motion sensing data in the sixth curve.
  • that is, the initial motion sensing data is first processed by the attitude calculation algorithm, and after the processing result is obtained, the processing result is calibrated based on the first constraint condition to obtain the target motion sensing data.
  • the processor is further configured to: align the first constraint condition and the initial motion sensing data according to a time difference.
  • the first constraint condition and the initial motion sensing data may be aligned through the determined time difference, so as to eliminate the effect of the time difference.
  • the time difference is calculated through an initialization process before the first time period; the motion sensor is further configured to acquire motion sensing data in the initialization process, the motion sensing data in the initialization process being triggered by the specified limb movements made by the user; in addition, the processor is further configured to determine the time difference based on the signal characteristics of the motion sensing data in the initialization process and the signal characteristics of the image data in the initialization process, where the image data in the initialization process is acquired by the camera in the initialization process and includes the specified limb movement information made by the user.
  • that is, the display screen prompts the user to perform a specified limb action; while the user performs the action, the motion sensing data in the initialization process is acquired through the motion sensor, the image data in the initialization process is acquired through the camera, and the processor further determines the time difference based on the motion sensing data and the image data.
  • the processor is further configured to: determine initial relative information between the user and the camera according to the image data in the initialization process.
  • the initial relative information may include distance, orientation, and the like.
  • the processor is specifically configured to: determine an initial ergonomic arm model according to the initial relative information, where the initial ergonomic arm model includes at least a first value range of limb movement angles; then, update the first constraint condition according to the initial ergonomic arm model to obtain an updated first constraint condition; thereafter, calibrate the initial motion sensing data according to the updated first constraint condition to obtain the target motion sensing data.
  • that is, the ergonomic arm model constructed from the relative information between the user and the camera can be used to update the first constraint condition; in other words, the first constraint condition is further constrained based on the ergonomic arm model, so as to avoid problems such as inaccurate cursor positioning and cursor overflow.
  • the processor is specifically configured to: determine first relative information between the user and the camera according to the first image data; then, when the first relative information is different from the initial relative information, update the initial ergonomic arm model according to the first relative information to obtain a first ergonomic arm model; further, update the first constraint condition according to the first ergonomic arm model to obtain the updated first constraint condition.
  • that is, the ergonomic arm model constructed based on the relative information between the user and the camera can also be updated when the relative information changes, and the updated ergonomic arm model is used to further constrain the first constraint condition, so as to ensure the timeliness of the control information for controlling the display screen.
  • the processor is further configured to: first determine the user's initial mapping relationship in the display screen according to the initial relative information; then, perform coordinate transformation processing on the target motion sensing data according to the initial mapping relationship to obtain the control information.
  • that is, the user's initial mapping relationship in the display screen can be further determined according to the initial relative information determined in the initialization process, and the initial mapping relationship can be used as the processing basis of the control information, so as to avoid problems such as inaccurate positioning and cursor overflow arising when the relative information changes.
  • the processor is specifically configured to: determine first relative information between the user and the camera according to the first image data; then, when the first relative information is different from the initial relative information, update the initial mapping relationship according to the first relative information to obtain a first mapping relationship; further, perform coordinate transformation processing on the target motion sensing data according to the first mapping relationship to obtain the control information.
  • that is, the initial mapping relationship determined based on the relative information between the user and the camera can also be updated when the relative information changes, and the updated mapping relationship is used to perform coordinate transformation processing on the target motion sensing data to obtain the control information, so as to ensure the timeliness of the control information for controlling the display screen.
  • the motion sensor includes one or more sensing units of an accelerometer, a gyroscope, and a magnetometer.
  • the motion sensor may be an IMU data acquisition device, wherein the IMU data acquisition device may include a sensing unit of one or more sensors among an accelerometer, a gyroscope, and a magnetometer.
  • the motion sensor and the processor included in the first electronic device in the second aspect can also perform the implementation processes in the first aspect and any possible implementation manner thereof, and achieve corresponding beneficial effects, which are not repeated here one by one.
  • a third aspect of the embodiments of the present application provides a second electronic device, including a camera and a display screen. The camera is used to acquire first image data at a second sampling frequency in a first time period, the first image data including the user's limb movement information; the first image data is used to determine a first constraint condition, and the first constraint condition is used to calibrate initial motion sensing data to obtain target motion sensing data; the initial motion sensing data is obtained by a motion sensor in a first electronic device sampling at a first sampling frequency within the first time period and is triggered by the user's limb movement, and the second sampling frequency is less than the first sampling frequency; thereafter, the display screen is used to display the control information, where the control information is obtained based on the target motion sensing data.
  • in this embodiment, the first image data acquired by the camera at the second sampling frequency in the first time period is used to determine the first constraint condition; the first constraint condition can be used to calibrate the initial motion sensing data obtained at the first sampling frequency within the first time period to obtain the target motion sensing data, from which control information for controlling the display screen is further obtained, so that the display screen displays the control information.
  • the second sampling frequency is lower than the first sampling frequency, that is, the target motion sensing data is obtained by asynchronously calibrating the initial motion sensing data.
  • since the calculation time of the CV recognition process is generally much longer than the processing time of the IMU data, the asynchronous calibration implementation does not need to wait for the long CV processing; this effectively avoids problems such as display freezes and display delays, so controlling the cursor in the display device through asynchronous calibration can improve the continuity of cursor movement in the display device, thereby improving user experience.
  • the first constraint condition is obtained by performing human skeleton key point recognition in the computer vision (CV) processing on the first image data, and the first constraint condition includes three-dimensional spatial orientation angle information.
  • the CV processing may be implemented based on a three-dimensional human skeleton recognition technology, or may be implemented based on a two-dimensional human skeleton recognition technology, which is not limited here.
  • the first constraint condition for asynchronous calibration of the initial motion sensing data may be three-dimensional space orientation angle information obtained by CV identification processing.
  • the control information is coordinate data obtained by performing coordinate transformation on the target motion sensing data, the coordinate data being used to control a cursor on the display screen; or, the control information is a gesture identification result obtained by mapping the target motion sensing data, the gesture identification result being used to operate an interface element of the display screen.
  • that is, the control information for controlling the display screen, obtained by processing the target motion sensing data that results from asynchronous calibration, can perform various operations on the display screen, such as controlling the display position of the cursor on the display screen, or operating interface elements on the display screen through selection, zooming, dragging, clicking, and the like.
  • the display screen is further configured to display first prompt information, which is used to prompt the user to make a specified limb action; in addition, the camera is further configured to acquire image data in an initialization process before the first time period, the image data in the initialization process including the specified limb action information made by the user; the signal characteristics of the image data in the initialization process and the signal characteristics of the motion sensing data in the initialization process determine the time difference, the time difference being used to align the first constraint condition and the initial motion sensing data, where the motion sensing data in the initialization process is acquired by the first electronic device during the initialization process.
  • the first constraint condition and the initial motion sensing data may be aligned through the determined time difference, so as to eliminate the effect of the time difference.
  • the camera includes one or more cameras selected from a depth camera and a non-depth camera.
  • the camera may include various implementations, such as a depth camera, a non-depth camera, etc., so that the solution is suitable for different application scenarios.
  • the camera and the display screen included in the second electronic device in the third aspect can also perform the implementation processes in the first aspect and any possible implementation manner thereof, and achieve corresponding beneficial effects, which are not repeated here one by one.
  • a fourth aspect of the embodiments of the present application provides an electronic device, including a processor coupled to a memory; the memory is used to store a program, and the processor is used to execute the program in the memory, causing the electronic device to execute the human-computer interaction method described in the above aspects.
  • the motion sensor mentioned in the above human-computer interaction method can be integrated in the electronic device, or can be independently provided outside the electronic device and connected to the electronic device in a wired/wireless manner, which is not limited here.
  • the camera mentioned in the above-mentioned human-computer interaction method may be integrated in the electronic device, or independently provided outside the electronic device and connected to the electronic device in a wired/wireless manner, which is not limited here.
  • the display screen mentioned in the above human-computer interaction method can be integrated in the electronic device, or can be independently provided outside the electronic device and connected to the electronic device in a wired/wireless manner, which is not limited here.
  • a fifth aspect of the embodiments of the present application provides a computer program, which, when running on a computer, enables the computer to execute the human-computer interaction method described in the first aspect and any implementation manner thereof.
  • a sixth aspect of the embodiments of the present application provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when it runs on a computer, it causes the computer to execute the human-computer interaction method described in the first aspect and any implementation manner thereof.
  • a seventh aspect of an embodiment of the present application provides a circuit system, where the circuit system includes a processing circuit, and the processing circuit is configured to execute the human-computer interaction method described in the first aspect and any implementation manner thereof.
  • An eighth aspect of an embodiment of the present application provides a chip system, where the chip system includes a processor, configured to support implementing the functions involved in the first aspect and any implementation manner thereof, for example, sending or processing the data and/or information involved in the foregoing method.
  • the chip system further includes a memory for storing necessary program instructions and data of the server or the communication device.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • FIG. 1 is a schematic diagram of a human-computer interaction implementation.
  • FIG. 2 is a schematic diagram of an application scenario in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 4 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 5 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 6 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 7 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 8 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 9 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 10 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 11 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a first electronic device according to an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a second electronic device according to an embodiment of the present application.
  • in the process of users performing human-computer interaction using terminal devices, the full-scene immersive experience has become a development trend of terminal devices.
  • for example, there are various application scenarios for the full-scene immersive experience, such as extended reality (XR) implemented through virtual reality (VR), augmented reality (AR), or mixed reality (MR).
  • there is also, for example, the application scenario of human-computer interaction with devices having display screens, such as computers, TVs, and smart screens (or large screens).
  • in the above application scenarios, the user's limb movement is a direct and convenient input method: a wearable device with sensors such as an inertial measurement unit (IMU) can be used as a medium to collect motion information of the user's limbs (for example, the hand, wrist, or arm), and the user's limb movement information carried in images or videos captured by a camera can also be used as feedback to identify the user's operation intention, so that the user can realize human-computer interaction in a clearer and smoother way.
  • take the process of human-computer interaction between a user and a device with a display screen through an air mouse mode (also referred to as an in-air mouse mode, an air operation mode, etc.) as an example.
  • the air mouse mode may refer to a human-computer interaction mode in which, by adding sensors (such as a gyroscope or a 3-dimension gravity sensor, 3D-Gsensor) to a wireless mouse or wireless control device, the cursor on the display screen can follow the movement of the user's limbs in the air, without the device being placed on a fixed desktop.
  • the air mouse mode can also be extended to a human-computer interaction mode in which a terminal device (such as a wearable device like a watch or wristband) is used to control the cursor on the display screen (for operations such as moving, dragging, zooming in, and clicking).
  • the air mouse mode mainly includes the following mainstream implementation methods:
  • Method 1: use a wearable device with an IMU to recognize the user's body movements, and control the cursor on the display screen according to the recognition result, so as to realize human-computer interaction.
  • the main components of the IMU include a gyroscope, an accelerometer, and a magnetometer.
  • the gyroscope can detect the angular velocity signal of the wearable device relative to the navigation coordinate system (such as an earth-fixed coordinate system or a geographic coordinate system); the accelerometer can detect the three-axis acceleration signals of the wearable device in the carrier coordinate system (the coordinate origin being the center of the carrier, and the three axes being the left-right, front-rear, and up-down axes of the carrier); and the magnetometer can obtain information about the magnetic field around the wearable device (for example, a smart watch).
  • the main function of the IMU is to fuse the data of the three sensors (the gyroscope, the accelerometer, and the magnetometer), obtain more accurate attitude information through the processing of an attitude calculation algorithm, and recognize the user's limb movements based on the attitude information.
  • the attitude calculation algorithm may include a Mahony algorithm, a Kalman filter algorithm, and the like. This implementation requires little computing power and has good real-time performance, so the cursor position on the display screen refreshes quickly and tracks smoothly (a simplified fusion example follows below).
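  • As a much-reduced stand-in for the Mahony or Kalman fusion named above, the one-axis complementary filter below shows the basic idea of fusing gyroscope and accelerometer information into an attitude angle; this is a simplification for illustration, not the algorithm used in the embodiments:

```python
def complementary_pitch(prev_pitch, gyro_rate, accel_pitch, dt, alpha=0.98):
    """One-axis attitude fusion: integrate the gyro for short-term smoothness
    and pull toward the accelerometer-derived angle to cancel gyro drift.
    alpha close to 1 trusts the gyro short-term; the constant is illustrative."""
    gyro_estimate = prev_pitch + gyro_rate * dt   # degrees, from deg/s rate
    return alpha * gyro_estimate + (1.0 - alpha) * accel_pitch

# Example: one update of a 100 Hz loop (dt = 0.01 s)
pitch = 0.0
pitch = complementary_pitch(pitch, gyro_rate=5.0, accel_pitch=0.4, dt=0.01)
print(pitch)
```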
  • Method 2: recognize specific gestures in the user's body movements through computer vision (CV) recognition technology, and control the cursor on the display screen according to the recognition result, so as to realize human-computer interaction.
  • in Method 2, various devices may include cameras (e.g., depth cameras or non-depth cameras) and/or other sensors (e.g., the photosensors used in photoplethysmography (PPG), infrared sensors, radar, etc.).
  • in the following, the camera is taken as a depth camera as an example.
  • the depth camera can collect image information of a specific gesture performed by the user, and the processor uses a pre-established image recognition model to perform CV recognition based on the image information to obtain a first judgment result; other sensors collect sensor signals of the specific gesture performed by the user, and the processor uses a pre-established sensor recognition model to recognize based on the sensor signals to obtain a second judgment result; after that, the processor performs fusion processing based on the first judgment result and the second judgment result, and after determining the specific gesture performed by the user, controls the cursor on the display screen according to the control operation corresponding to the specific gesture.
  • the image recognition model and the sensor recognition model both include the correspondence between specific gestures and control operations on the display screen.
  • however, in Method 2 the control operation depends on gesture recognition, whose CV processing of the depth-sensor data takes a long time; this easily causes problems such as cursor display freezes and display delays for the control operation corresponding to the recognized gesture, resulting in poor user experience.
  • Method 3: integrate the IMU positioning solution of Method 1 with the CV recognition solution of Method 2 to realize a human-computer interaction mode with real-time calibration.
  • the device including the IMU identifies the gesture information of the user's limb movements to obtain initial positioning information; at the same time, the image information of the user's limb movements is identified and located through a camera (as in Method 2, the camera may be a depth camera or a non-depth camera; here a depth camera is taken as an example) to obtain calibration information; after that, the initial positioning information is calibrated according to the calibration information to obtain a calibration result, and based on the calibration result, the cursor on the display screen is operated to realize human-computer interaction.
  • this method can reduce, through the calibration process, the spatial translation tracking error existing in the IMU calculation process, and realize real-time tracking of the user's limb movements.
  • exemplarily, the implementation process shown in FIG. 1 is used as an example to illustrate the real-time calibration human-computer interaction mode.
  • in steps S1 and S2, the user operates the device containing the IMU to form the movement trajectory of the air mouse, so that the device containing the IMU can collect the IMU data;
  • in step S3, image data is collected through the depth camera and CV recognition processing is performed based on the image data, the obtained CV recognition result serving as the basis for the real-time calibration in step S4;
  • in steps S4 and S5, the offset angle data tracked and determined in real time from the IMU data obtained by the device containing the IMU in step S2 is calibrated according to the CV recognition result and converted into coordinate data (X, Y);
  • in step S6, the coordinate data (X, Y) obtained in step S5 is displayed on the screen in real time, or, instead of displaying the coordinate data (X, Y) itself, a cursor (which can be an icon/graphic/image of any size, shape, or transparency) is displayed at the screen position corresponding to (X, Y).
  • although Method 3 can, to a certain extent, solve the inaccurate positioning caused by using the IMU alone for cursor positioning, the CV recognition performed by the device containing the depth camera differs greatly from the IMU data processing performed by the device containing the IMU: the calculation amount of the former is much larger than that of the latter, so there is a large gap between their calculation times. That is to say, each frame of the cursor displayed on the display screen depends completely on the real-time CV calibration and is limited by the hardware computing capability, whereas each calculation of the air mouse movement trajectory formed from the data collected by the IMU usually takes only a few milliseconds to more than ten milliseconds; the real-time calibration method (that is, Method 3) therefore has to wait a long time for the CV recognition process.
  • for example, suppose each processing of the IMU data takes 10 milliseconds and each CV recognition process takes 200 milliseconds, and denote the duration interval as (0, 1000] (here and in the subsequent description, duration intervals are in milliseconds). The CV recognition result in step S3 then consists of CV key point data at 5 moments in total, namely (200, 400, 600, 800, 1000); in order to realize synchronous calibration of the IMU data based on the CV recognition result, the moments at which IMU data is collected in step S2 must be limited to those same 5 moments. Consequently, the cursor data displayed in real time on the screen in step S6 amounts to only 5 updates, that is, the refresh frequency of the cursor on the screen can only equal the CV recognition frequency, at most 5 hertz (Hz), which easily causes problems such as cursor display freezes and display delays, leading to poor user experience (checked numerically below).
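  • The arithmetic behind these numbers can be checked in two lines: with synchronous calibration the refresh rate is bounded by the CV processing time, while asynchronous calibration lets the IMU processing time set the bound:

```python
imu_ms, cv_ms = 10, 200   # per-step processing times from the example above
print(1000 // cv_ms, "Hz refresh with synchronous calibration")    # 5 Hz, CV-bound
print(1000 // imu_ms, "Hz refresh with asynchronous calibration")  # 100 Hz, IMU-bound
```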
  • in view of this, the embodiments of the present application provide a human-computer interaction method and related equipment, which control the cursor in the display device through asynchronous calibration; this can improve the continuity of the cursor movement in the display device, thereby improving the user experience.
  • in the following embodiments, the motion sensor is taken as an IMU data acquisition device only as an example; in practical applications, the motion sensor may also be another device, such as an accelerometer data acquisition device, a gyroscope data acquisition device, a magnetometer data acquisition device, or other devices, which is not limited here.
  • the display screen can be included in a display device, and the display device can also include a base for carrying the display screen, physical buttons or touch-screen buttons for controlling the parameters of the display screen (such as brightness and contrast), a power supply providing electrical energy, a wired/wireless communication module for transmitting control instructions for the display screen, and the like, which are not limited here; the display device mentioned in the following embodiments is mainly used to realize the display function of the display screen.
  • the camera may be included in the image acquisition device, and the image acquisition device may also include a power supply for providing power to the camera, a wired/wireless communication module for transmitting control instructions for the camera, etc., which are not limited here.
  • the image acquisition device mentioned in the following embodiments is mainly used to realize the shooting (image or video) function of the camera.
  • FIG. 2 is a schematic diagram of an application scenario of an embodiment of the present application, where the application scenario at least includes an image acquisition device, a display device, and an IMU data acquisition device.
  • the IMU data acquisition device is used to collect IMU data, and can be included in terminal devices with an IMU, such as mobile phones, remote control devices (such as remote controls and handles), tablet computers, and wearable devices (such as smart watches and smart bracelets).
  • the display device is a picture output device, which can be included in a device with a display screen, such as a computer, a TV, a smart screen (or a large screen), etc.
  • the image acquisition device is used to collect image data (including one or more frames of image information, or a video stream containing multiple frames of images, etc.), and can be a camera, such as a depth camera, a non-depth camera, or another image acquisition device, which is not limited here.
  • the image capturing device and the display device may be integrated in the same device, such as a computer, a TV, a smart screen (or a large screen), and the like.
  • the electronic device for executing the asynchronous calibration process in the human-computer interaction method includes a processor, and the electronic device may have multiple implementations.
  • for example, the electronic device may be a device including an IMU data acquisition device, connected in a wired/wireless manner to one or more devices including an image acquisition device and/or a display device; or, the electronic device may be a device including a display screen, connected in a wired/wireless manner to one or more devices including an image acquisition device and/or an IMU data acquisition device; or, the electronic device may be a device including an image acquisition device, connected to one or more devices including a display device and/or an IMU data acquisition device; alternatively, the electronic device may be another device that includes none of the image acquisition device, the display device, and the IMU data acquisition device (such as a smart speaker, a robot, a server, or a computing center), connected in a wired/wireless manner to one or more devices including an image acquisition device, a display device, and/or an IMU data acquisition device.
  • a user interacts with a large screen by holding a mobile phone.
  • the mobile phone is a device that includes an IMU data acquisition device
  • the large screen is a device that includes a display device.
  • the mobile phone and the large screen respectively execute the relevant steps of the human-computer interaction method provided by the embodiments of the present application, so that when the user holds the mobile phone, makes physical actions, and drives the mobile phone to move, the large screen responds to the trajectory of the mobile phone's movement and displays the cursor at the corresponding position on the large screen.
  • in this scenario, the asynchronous calibration process in the human-computer interaction method may preferably be executed by the processor of the mobile phone; the asynchronous calibration process may also be performed by the processor of the large screen, or by the processor of another device (e.g., a server or a computing center), which is not limited in this application.
  • the application scenarios of the human-computer interaction method provided by the embodiments of the present application also include, but are not limited to: the user wears a watch/bracelet to interact with the large screen, the user wears a motion sensor to interact with the large screen, or the user holds a remote control to interact with a head-mounted display device. It should be understood that any device including an IMU data acquisition device, combined with any device including a display device, can use the human-computer interaction method provided by the embodiments of the present application to perform human-computer interaction.
  • in the following embodiments, the electronic device is a device equipped with an IMU data acquisition device, and the image acquisition device and the display device are integrated in another device (such as a computer, a TV, or a smart screen) as an example. That is, the electronic device includes the IMU data acquisition device and communicates with the other device through a wired/wireless connection to acquire the image data collected by the image acquisition device, or the processing result obtained by the image acquisition device from the collected image data, so that the electronic device can, through the implementation process of the human-computer interaction method provided by the present application, obtain the control information for controlling the cursor in the display device and send it to the display device.
  • the user's hand (including the fingers, wrist, palm, etc.) carries the device including the IMU data acquisition device in a handheld or wearable manner.
  • the user moves his hand in the shooting area of the image acquisition device, which drives the IMU data acquisition device to move.
  • during this process, the IMU data acquisition device collects IMU data, the image acquisition device collects image data, and the electronic device obtains the IMU data and the image data and performs fusion processing to obtain air mouse data; the display device can then display the cursor based on the air mouse data. Therefore, the user can interact with the display device in the air mouse mode by moving the hand; for example, the user can select, move, drag, zoom in, and click interface elements in the display interface of the display device in the air.
  • initialization may be performed first, and the time difference may be obtained by calculation; then, in the subsequent implementation of the human-computer interaction method provided by the embodiments of the present application, the data collected by the IMU data acquisition device and the data collected by the image acquisition device are aligned according to the time difference calculated in the initialization process, so that the finally displayed cursor is accurately positioned and its trajectory is accurate.
  • the above initialization process may be a process in which the user performs a specified physical action facing the display device.
  • for example, the display screen displays the text "please face the screen" to prompt the user to adjust the position relative to the display screen, and the display screen displays the text "please draw a W curve" to prompt the user to make the specified body motion; the user then moves the arm, drawing a W curve in the air. Therefore, the watch processor can calculate the inherent time difference between the IMU and the camera hardware according to the IMU data collected while the user draws the W curve and the image data collected by the camera (or the recognition result obtained based on the image data).
  • when the user carries an electronic device with an IMU data acquisition device by hand or wears it and is in the shooting area of the image acquisition device, for example when entering the range of (or disconnecting from and reconnecting to) the display device for the first time, the electronic device can realize the initialization process through the human-computer interaction method shown in FIG. 3 below, so as to align the time information (also called the time stamp or time axis) between the IMU data acquisition device and the image acquisition device.
  • that is, the implementation shown in FIG. 3 can be used to avoid problems such as inaccurate calibration caused by the time difference between data collected by different devices in the aforementioned Method 2 and Method 3, which results in misplaced cursor display.
  • FIG. 3 is a schematic flowchart of a human-computer interaction method according to an embodiment of the present application. The method includes the following steps.
  • S101: in the first duration corresponding to the initialization process, when the user carries the electronic device equipped with the IMU data acquisition device in a handheld or wearable manner and is in the shooting area of the image acquisition device, the electronic device acquires the initialization IMU data through the IMU data acquisition device by collecting the user's limb movements within the first duration, and the image acquisition device acquires the initialization image data by collecting the user's limb movements in the same duration.
  • the electronic device may receive the initialization image data through wired/wireless communication with the image acquisition device.
  • the sampling frequencies of the two devices, the IMU data acquisition device and the image acquisition device, may be the same; for example, both sampling frequencies are 100 hertz (Hz), that is, in step S101 the two devices can collect IMU data at 100 moments and image data at 100 moments within any second of the first duration.
• Alternatively, the sampling frequencies of the IMU data acquisition device and the image acquisition device may be different; for example, the sampling frequency of the IMU data acquisition device is 100 hertz (Hz) and the sampling frequency of the image acquisition device is 5 Hz. That is, in step S101, the two devices can collect IMU data of 100 moments and image data of 5 moments within any second of the first duration.
• S102: Determine the difference between the timestamp of the IMU data acquisition device and the timestamp of the image acquisition device according to the initialization IMU data and the initialization image data.
• Specifically, in step S102 the electronic device determines the difference between the timestamp of the IMU data acquisition device and the timestamp of the image acquisition device according to the initialization IMU data and the initialization image data obtained in step S101. After the initialization process, the timestamps of the IMU data acquisition device and the image acquisition device can be aligned according to the difference.
• The electronic device may analyze the signal characteristics of the initialization image data and the initialization IMU data and calculate the fluctuation frequency to determine the difference value; the difference may also be determined by regression linear fitting, which is not limited here.
• The following description takes, as an example, the case in which the electronic device determines the difference by calculating the fluctuation frequency in step S102.
• The electronic device may obtain the IMU data fluctuation curve within the first duration according to the initialization IMU data and determine the wave peak position (and/or the wave trough position) in the IMU data fluctuation curve; similarly, the electronic device may determine the CV data fluctuation curve within the first duration according to the CV key point detection result and determine the wave peak position (and/or the wave trough position) in the CV data fluctuation curve, where the CV key point detection result includes the three-dimensional orientation angle of the position where the user holds or wears the electronic device (e.g., left wrist, right wrist, etc.). After that, the electronic device compares the time information of the wave peak position (and/or the wave trough position) in the IMU data fluctuation curve with the time information of the wave peak position (and/or the wave trough position) in the CV data fluctuation curve; the resulting difference in time information is the difference between the timestamp of the IMU data acquisition device and the timestamp of the image acquisition device. Therefore, after step S102, the IMU data collected by the IMU data collection device and the image data of the image collection device can be aligned according to the difference.
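• As a minimal illustration of this peak-comparison approach (a sketch only: the sinusoidal signals, sampling rates, and the 30 ms offset below are assumed for demonstration and are not taken from the embodiment), the timestamp difference could be estimated as follows:

```python
import numpy as np

def peak_time(t, v):
    """Timestamp at which a fluctuation curve reaches its wave peak."""
    return t[int(np.argmax(v))]

def timestamp_difference(imu_t, imu_v, cv_t, cv_v):
    """Difference between the IMU timestamp and the camera timestamp,
    estimated from the wave-peak positions of the two curves."""
    return peak_time(imu_t, imu_v) - peak_time(cv_t, cv_v)

# Assumed example: a 100 Hz IMU and a 5 Hz camera observe the same motion,
# with the camera stream lagging the IMU stream by 30 ms.
imu_t = np.arange(0.01, 1.01, 0.01)        # 100 moments (seconds)
imu_v = np.sin(2 * np.pi * imu_t)          # IMU data fluctuation curve
cv_t = np.arange(0.2, 1.01, 0.2)           # 5 moments (seconds)
cv_v = np.sin(2 * np.pi * (cv_t - 0.03))   # CV data fluctuation curve

diff = timestamp_difference(imu_t, imu_v, cv_t, cv_v)
# Subtracting `diff` from the IMU timestamps aligns the two streams.
```

• With only 5 camera samples per second this single-peak estimate is coarse; as noted above, the difference may instead (or additionally) be refined by regression linear fitting over several peaks and troughs.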
• In a possible implementation, the image acquisition device may perform CV key point recognition on the initialization image to obtain the CV key point detection result and send the CV key point detection result to the electronic device in step S101; alternatively, the image acquisition apparatus sends the initialization image to the electronic device in step S101, so that the electronic device performs CV key point recognition on the initialization image to obtain the CV key point detection result, which is not limited here.
• In addition, the electronic device may determine that the user enters the initialization process based on a variety of different triggering methods, and then execute steps S101 and S102. For example, the initialization process may be triggered based on the user's specific limb movements (such as a left swipe or a right swipe) acquired by the IMU data acquisition device, or may be triggered when the user is in the shooting area of the image acquisition device, or may be triggered when communication is established between the image acquisition device and the IMU data acquisition device, or by other triggering methods, which are not limited here.
• In the initialization process, the electronic device may further determine initial relative information between the user and the image acquisition device according to the initialization IMU data and the initialization image data, where the initial relative information may include parameters such as distance and orientation.
• For example, the CV key point identification process can obtain the shoulder width through the two CV key points of the left shoulder and the right shoulder, which is used to determine the distance between the user and the image acquisition device and the relative orientation between the user and the image acquisition device. Alternatively, the distance and the relative orientation between the user and the image capture device may be determined using facial key points, such as eye spacing, ear spacing, and other parameters.
• In addition, the electronic device can obtain a certain proportional relationship from any frame of the images collected during the initialization process. For example, by default it is assumed that body rotation does not cause the head width to change: the ratio of the user's head width to the shoulder width is taken as a preset proportional coefficient, and subsequent changes of this proportional coefficient, caused by changes in the apparent shoulder width, indicate a change of orientation.
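• The following sketch illustrates one way such a proportional relationship could be used; the cosine relation between apparent shoulder width and body yaw is an assumption introduced here for illustration, not a formula given by the embodiment:

```python
import math

def estimate_yaw_deg(head_px, shoulder_px, ratio0):
    """Apparent shoulder width shrinks roughly with cos(yaw) as the torso
    rotates, while head width stays constant (assumption stated above)."""
    cos_yaw = max(-1.0, min(1.0, ratio0 * shoulder_px / head_px))
    return math.degrees(math.acos(cos_yaw))

# Initialization frame: user faces the camera.
ratio0 = 60 / 180   # preset coefficient: head width / shoulder width (pixels)

# Later frame: head width unchanged, apparent shoulder width reduced.
print(f"yaw = {estimate_yaw_deg(60, 127, ratio0):.0f} deg")   # about 45 deg
```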
• In a possible implementation, the determination of the initial relative information can be achieved by the electronic device executing the implementation process shown in step S101.
• Specifically, the user's hand (including fingers, wrist, palm, etc.) carries the device equipped with the IMU data acquisition device by holding or wearing it, and the user is in the shooting area of the image acquisition device; the image acquisition device collects a video stream, where the video stream includes multiple frames of user images, and thereafter CV key point detection is performed on the multiple frames of user images captured by the image acquisition device to obtain the initial relative information.
• The initial relative information can be used as a calibration reference value for relative information (such as distance and orientation). At any time after the initialization process, if the acquired relative information is different from the calibration reference value (for example, triggered by the user walking or turning around), it is determined that the relative information (such as distance and orientation) between the user and the image capture device has changed.
• The following takes as an example the case in which the electronic device provided with the IMU data acquisition device is a wearable device, such as a smart watch including an IMU.
  • the user wears the smart watch and stands directly in front of the display screen.
• During the initialization, the image acquisition device captures the gesture video stream data (including at least a first image and a second image), obtains the user's initial standing distance, body orientation (initialized as facing the front), shoulder width, head size, and other parameters through CV key point detection as the initial image data, and synchronously aligns these with the IMU data corresponding to the preset actions collected by the IMU data acquisition device.
  • the initialization process can be implemented by the example shown in FIG. 4 .
• In this case, the initialization process includes: the display screen displays "please face the screen" through the interface, and when the image capture device detects that the user is facing the screen, the user can be reminded to perform the initialization operation, that is, "please draw the W curve" is displayed on the interface; after the image acquisition device detects that the user has drawn the W curve in the air, the image acquisition device acquires the initial image data of the user in the process of drawing the W curve, the IMU data acquisition device collects the user's IMU data, and the initial image data and the IMU data are aligned to initialize the cursor position.
• Alternatively, the initialization process can be implemented by the example shown in FIG. 5, and includes: when the image acquisition device detects the user, the user is assumed by default to be facing the screen, and the user can then be reminded to perform the initialization operation, with the display screen displaying "please draw the W curve facing the screen"; after the image acquisition device detects that the user has drawn the W curve in the air, the image acquisition device collects the initial image data of the user in the process of drawing the W curve, the IMU data acquisition device collects the user's IMU data, and the initial image data and the IMU data are aligned to initialize the cursor position.
• In another possible implementation, the electronic device can realize the determination of the initial relative information without performing the implementation process shown in step S101.
• For example, the user stands in front of the screen of the display device and enters the air mouse mode at startup, and ranging technology based on a monocular image acquisition device (such as a monocular camera) is used, which calculates the actual coordinates of the corresponding pixels based on a similar-triangle ratio and initializes the model; parameters such as distance and body shape are aligned with the IMU measurement data of the watch and the initial image data of the display device.
  • the user can enter the air mouse mode without initializing gestures.
• The monocular image acquisition device ranging technology has relatively high requirements for camera calibration and requires that the distortion caused by the lens itself be relatively small; in general, however, this method offers strong portability and practicability, and can also achieve accurate estimation of the initialization parameters.
• Without the initialization process, the ranging accuracy may be sacrificed, but a depth image acquisition device can be used to make up for the accuracy.
• That is, the monocular image acquisition device is directly modeled in the initialization stage, and the user's position distance and body shape parameters (that is, orientation) are obtained through the ranging principle of the monocular camera.
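• A minimal sketch of the similar-triangle (pinhole) ranging relation mentioned above; the focal length and the assumed real shoulder width are illustrative values, not parameters from the embodiment:

```python
# Illustrative sketch (assumed calibration values): monocular ranging by the
# similar-triangle relation of the pinhole camera model.
def monocular_distance_m(focal_px, real_width_m, pixel_width_px):
    # distance = focal_length_in_pixels * real_size / apparent_size_in_pixels
    return focal_px * real_width_m / pixel_width_px

# Assumptions: focal length 1000 px; the user's shoulders are 0.40 m wide
# and span 180 px in the captured image.
print(monocular_distance_m(1000, 0.40, 180))   # about 2.22 (metres)
```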
• In an embodiment of the present application, the electronic device can execute the human-computer interaction method shown in FIG. 6 to control the cursor in the display device through asynchronous calibration, which can improve the continuity of the cursor movement in the display device, thereby improving user experience.
  • the electronic device can also directly execute the human-computer interaction method shown in FIG. 6 without going through the initialization process shown in FIG. 3 .
• For example, the time difference between the two devices in collecting/processing data may be at the microsecond level or even lower; such a time difference is too small for the user to perceive during human-computer interaction, so the user cannot notice the cursor display effect caused by it. Therefore, in the human-computer interaction method shown in FIG. 6, the cursor in the display device can be controlled through asynchronous calibration without the initialization process shown in FIG. 3, and the coherence of the cursor movement in the display device can still be improved to enhance the user experience.
  • FIG. 6 is a schematic flowchart of a human-computer interaction method according to an embodiment of the present application. The method includes the following steps.
• S201: The electronic device determines initial IMU data.
• Specifically, the electronic device collects the user's body movements through the IMU data acquisition device at the first set of moments to obtain the initial IMU data.
  • the first time set is included in the first time period.
• For example, the sampling frequency of the IMU data acquisition device is the first sampling frequency (for example, 100 Hz), and the multiple moments included in the first time set are the 100 moments in every second of the first time period.
  • the electronic device including the IMU data acquisition device is used as an example for description.
• For example, the electronic device may be a mobile phone, a remote control device (such as a remote control or a handle), a tablet computer, a wearable device (such as a smart watch or a smart bracelet), or the like.
• In step S201, when the user carries the electronic device equipped with the IMU data collection device by holding or wearing it and performs the air mouse operation, the IMU continuously tracks the change of the user's gesture: the main components included in the IMU, such as the gyroscope, accelerometer, and magnetometer, record the IMU data and collect the user's limb movements at the first time set to obtain the initial IMU data, so that the electronic device can determine the initial IMU data in step S201.
• In a possible implementation, the initial IMU data obtained in step S201 may be the data obtained after the IMU data recorded by the IMU data acquisition device is processed through waveform smoothing processing, de-noising calibration compensation processing, or other methods, which is not limited here.
• S202: The electronic device determines the first image data.
• Specifically, the image acquisition device collects the user's limb movements at the second set of moments to obtain the first image data, where the image data may include one or more frames of image information, a video stream containing multiple frames of images, and the like.
  • the electronic device may obtain the first image data through wired/wireless communication connection with the device including the image acquisition device.
  • the second time set is included in the first time period.
• For example, the sampling frequency of the image acquisition device is the second sampling frequency (for example, 5 Hz), and the multiple moments included in the second time set are the 5 moments in every second of the first time period.
  • the calculation time of the CV recognition process is generally much longer than the processing time of the IMU data.
• Each calculation process of CV recognition generally takes several hundred milliseconds, while the processing of each piece of IMU data generally takes several milliseconds to ten-odd milliseconds; the difference between the two is at least an order of magnitude. Therefore, the second time set corresponding to the image data collected by the image acquisition device in step S202 may be a subset of the first time set corresponding to the initial IMU data collected by the IMU data acquisition device in step S201.
• The following takes as an example the case in which the processing time of each piece of IMU data is 10 milliseconds and the processing time of each CV identification process is 200 milliseconds.
• In steps S201 and S202, the user carries the device equipped with the IMU data acquisition device by holding or wearing it, and the user performs the air mouse operation in the shooting area of the image acquisition device within a certain one-second time interval. The interval is denoted as (0, 1000], and the unit of the time intervals described here and in the subsequent description is milliseconds. In this case, the first time set for collecting the initial IMU data in step S201 is (10, 20, 30, ..., 200, 210, 220, ..., 400, ..., 600, ..., 800, ..., 1000), a total of 100 moments, and the second time set for collecting the first image data in step S202 is (200, 400, 600, 800, 1000), a total of 5 moments.
• In this way, the initial IMU data collected at 100 moments can be asynchronously calibrated based on the first image data collected at 5 moments, so as to avoid problems such as cursor drift that exist when only the IMU data is used for human-computer interaction (see the description of the aforementioned way 1).
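• The two sampling grids of this example can be written down directly; the subset relation between them is the premise of the asynchronous calibration (a sketch using only the example's numbers):

```python
# The sampling grids of the example above: IMU at 100 Hz, image/CV at 5 Hz,
# over the interval (0, 1000] in milliseconds.
imu_times = list(range(10, 1001, 10))    # 10, 20, ..., 1000 -> 100 moments
cv_times = list(range(200, 1001, 200))   # 200, 400, ..., 1000 -> 5 moments

# The second time set is a subset of the first: every CV moment has a
# matching IMU moment, which is the premise of asynchronous calibration.
assert set(cv_times) <= set(imu_times)
```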
• In a possible implementation, the image acquisition device may perform the process of acquiring image data when it detects that the user holds or wears the IMU data acquisition device and the user's position is in the shooting area of the image acquisition device, or in response to the user's voice wake-up, or in response to a user's operation on the electronic device, which is not limited here.
• In addition, the number of cameras included in the image capture device may be set to one, that is, the image capture device acquires image data through a single camera; or the number of cameras may be set to multiple, that is, the image capture device acquires image data through multiple cameras, which is not limited here.
• When the image acquisition device includes multiple cameras, different cameras can be used to cover scenes of different ranges, which solves the problem that a camera cannot switch the focal length back and forth, and ranging can be performed based on multiple sets of image data to improve ranging accuracy.
• When the image acquisition device includes one camera, compared with the arrangement of multiple cameras, camera hardware can be saved, and the calculation amount of image data can be reduced to improve the subsequent cursor response speed.
  • FIG. 7 is an implementation example of setting the positional relationship between the image capturing device and the display device in the electronic device.
  • the image capture device may be set outside the display area of the display device.
• For example, the image capture device is set at a position close to the upper frame of the display device.
• Alternatively, the image acquisition device is set at a position close to the lower border of the display device, or at other positions of the display device, such as a position close to the left border, a position close to the right border, or positions close to the upper left corner, upper right corner, lower left corner, or lower right corner of the display device, which is not limited here.
• In other implementations, the image capturing device may also be arranged within the display area of the display device. For example, as shown in (c) of FIG. 7, the image acquisition device is set at the middle position of the display area of the display device, or at other positions within the display area, such as positions close to the borders of the display area, which is not limited here.
• S203: The electronic device performs CV key point recognition according to the first image data to obtain a first constraint condition.
• Specifically, the electronic device performs CV key point recognition according to the first image data determined in step S202 to obtain the first constraint condition.
• In a possible implementation, the electronic device uses human body skeleton recognition technology to perform CV key point recognition on the human body included in the image data, and determines the obtained recognition result as the first constraint condition; for example, the CV identification result may include the three-dimensional space azimuth information of the CV key points.
• The CV key point identification and positioning can be implemented based on the number of CV key points being 9, 14, 16, 21, or another number of key points, which is not limited here. Illustratively, the following takes the implementation process in which the number of CV key points is 9 as an example.
• The CV key points at least include the position where the user holds or wears the electronic device containing the IMU data acquisition device, such as the user's left elbow, left wrist, right elbow, right wrist, left-hand fingers, or right-hand fingers; the CV key points can also be adjusted according to specific application scenarios, which is not limited here.
• In another possible implementation, the CV key point identification process may also be performed by the device including the image capture device to obtain the first constraint condition; that is, in step S202, the device including the image capture device may also send the first constraint condition to the electronic device. In this way, the electronic device does not need to perform the CV key point identification process, which can reduce the processing delay of the electronic device and improve its response speed.
• In a possible implementation, a preset neural network model may be used to process the input image data (for example, the aforementioned first image data) to obtain the first constraint condition. In this way, the processing efficiency can be greatly improved, and the response speed of the subsequent cursor in the display device can be further improved.
• The preset neural network model may be obtained by training with training samples, where a training sample may include image data and label data; the label data may be the CV key point coordinates corresponding to the image data, or a constraint condition corresponding to the image data (such as a three-dimensional space orientation angle), or both the CV key point coordinates and the constraint condition (such as a three-dimensional space orientation angle) corresponding to the image data.
• The training process can be performed locally by the electronic device, or locally by the device including the image capture device, or by a cloud server with the result then transmitted to the electronic device or the device including the image capture device by means of data transmission; there is no limitation here.
• The following still takes as an example the implementation process in which, in step S202, the user carries the device equipped with the IMU data acquisition device by holding or wearing it, and performs the air mouse operation in the shooting area of the image acquisition device within a certain second: the second time set at which the image acquisition device collects the first image data is (200, 400, 600, 800, 1000), a total of 5 moments, and in step S203 the electronic device performs CV key point identification on the image data corresponding to these five moments. For example, when the user wears a watch containing an IMU data acquisition device on the right wrist, the electronic device obtains the positioning coordinates of the right wrist corresponding to the operation, determines the movement direction of the right wrist according to the chronological order of the 5 moments so as to determine the three-dimensional space orientation angle of the right wrist (or the arm where the right wrist is located), and determines the three-dimensional space orientation angle as the first constraint condition.
• In a possible implementation, the human skeleton recognition technology used by the electronic device may perform CV key point recognition through a three-dimensional (3-dimensional, 3D) human skeleton recognition technology (for example, when the image acquisition device in step S202 acquires images through a monocular camera), or through a two-dimensional (2-dimensional, 2D) human skeleton recognition technology (for example, when the image acquisition device in step S202 acquires images through multiple cameras), which is not limited here.
  • the CV key points used by the electronic device to determine the first constraint condition in step S203 may be 2D human body CV key points or 3D human body CV key points, which are not limited here.
  • the electronic device can acquire the wearing position information of the device including the IMU data acquisition device, so that the electronic device can determine which CV key point among the multiple CV key points can be used as the first constraint condition.
• For example, the device including the IMU data acquisition device is a watch, and the electronic device is also the watch. The user wears the watch on the right wrist, and the watch can sense whether it is worn on the left wrist or the right wrist. When the CV key points include 6 CV key points, namely the left shoulder, right shoulder, left elbow, right elbow, left wrist, and right wrist, the watch can determine, according to the wearing position information, the three-dimensional space azimuth information of the CV key point of the right wrist as the first constraint condition.
• S204: The electronic device calibrates the initial IMU data based on the first constraint condition to obtain target IMU data.
• Specifically, the electronic device calibrates the initial IMU data determined in step S201 according to the first constraint condition determined in step S203 to obtain the target IMU data.
• In a possible implementation, the electronic device processes the IMU data recorded by the three sensors (gyroscope, accelerometer, and magnetometer) in the initial IMU data determined in step S201 through an attitude calculation algorithm to obtain attitude angle information, and calibrates the recorded IMU data based on the first constraint condition obtained in step S203 (or calibrates the attitude angle information based on the first constraint condition obtained in step S203, or calibrates both the recorded IMU data and the attitude angle information based on the first constraint condition obtained in step S203) to obtain the target IMU data.
• In a possible implementation, the attitude calculation algorithm may include a Mahony algorithm, a Kalman filter algorithm, and the like.
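• Since the full Mahony or Kalman formulations are lengthy, the following sketch uses a simplified complementary filter as a stand-in for the attitude calculation step: it fuses the integrated gyroscope rate with an accelerometer tilt estimate to produce an attitude (pitch) angle. All numeric values are assumed for demonstration:

```python
import math

def pitch_from_accel(ax, ay, az):
    """Tilt (radians) estimated from the measured gravity direction."""
    return math.atan2(-ax, math.hypot(ay, az))

def complementary_pitch(pitch_prev, gyro_rate_y, accel, dt, alpha=0.98):
    """Blend the integrated gyro rate (smooth but drifting) with the
    accelerometer tilt (noisy but drift-free)."""
    gyro_pitch = pitch_prev + gyro_rate_y * dt
    return alpha * gyro_pitch + (1.0 - alpha) * pitch_from_accel(*accel)

# Assumed 100 Hz IMU stream: (accelerometer m/s^2, gyro rad/s) per sample.
pitch = 0.0
samples = [((0.0, 0.0, 9.81), 0.1)] * 100
for accel, gyro_y in samples:
    pitch = complementary_pitch(pitch, gyro_y, accel, dt=0.01)
```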
• The following description still takes as an example the implementation process in which the user carries the device equipped with the IMU data acquisition device by holding or wearing it, and performs the air mouse operation in the shooting area of the image acquisition device within a certain second.
• The first time set of the initial IMU data collected by the IMU data collection device in step S201 is (10, 20, 30, ..., 200, 210, 220, ..., 400, ..., 600, ..., 800, ..., 980, 990, 1000), a total of 100 moments, and the second time set of the first image data collected by the image acquisition device in step S202 is (200, 400, 600, 800, 1000), a total of 5 moments. In step S203, the electronic device performs CV key point recognition on the image data corresponding to these five moments (for example, the user wears a watch containing an IMU data acquisition device on the right wrist) and obtains the first constraint condition containing the positioning coordinates at these five moments. Asynchronous calibration is then performed on the first time set according to the positioning coordinates at the five moments: the attitude angle information corresponding to the IMU data at the moments (200, 400, 600, 800, 1000) in the first time set is calibrated according to the positioning coordinates at the five moments, and the calibrated IMU data of the 100 moments is the target IMU data. In this example, the first constraint condition is the three-dimensional space direction angle of the arm indicated by the positioning coordinates at the 5 moments, and the initial IMU data is the IMU data at the 100 moments. That is, in step S204, the electronic device may use the three-dimensional space direction angle of the arm indicated by the positioning coordinates at 5 moments (in the first constraint condition) to calibrate the IMU data at 100 moments (in the initial IMU data) to obtain the target IMU data.
• In this way, asynchronous calibration is performed on the initial IMU data at many moments through the first constraint condition formed by CV identification of the image data at a small number of moments. Compared with a synchronous calibration method (for example, the aforementioned way 3), there is no need to wait a long time for CV identification: the control information obtained from the calibrated target IMU data can be used to control the cursor in the display device, so that the refresh frequency of the cursor in the display device can be the same as the frame rate of IMU data collection rather than being limited by the processing frequency of CV recognition. The refresh frequency of the cursor can thus be increased, problems such as cursor display freezes and display delays can be avoided, and the user experience can be improved.
• In another possible implementation, in step S204 the electronic device may first calibrate the IMU data recorded by the sensors in the initial IMU data according to the first constraint condition to obtain a calibration result, then process the calibration result through the attitude calculation algorithm, and use the attitude angle information obtained by the processing as the target IMU data.
• For example, the electronic device performs mapping processing according to the three-dimensional space orientation angles of the arm (at the five moments) in the first constraint condition to obtain the IMU calibration data corresponding to each arm three-dimensional space orientation angle (at the five moments), and fits the obtained IMU calibration data (of the 5 moments) to obtain an IMU calibration curve. An initial IMU curve is obtained by fitting the initial IMU data; further, weighted average processing is performed on the IMU calibration curve and the initial IMU curve to obtain an optimized curve, and the corresponding calibrated IMU data (of the 100 moments) is read from the optimized curve. After that, the calibrated IMU data is processed by the attitude calculation algorithm to obtain the attitude angle information (of the 100 moments), and the attitude angle information (of the 100 moments) is used as the target IMU data obtained in step S204.
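• A sketch of this curve-based implementation, with assumed data and low-order polynomial fitting standing in for the unspecified fitting method; the weight favouring the CV-derived calibration curve is likewise an assumption:

```python
import numpy as np

# Assumed data shapes for the example: 100 IMU moments and 5 CV moments.
imu_t = np.arange(10, 1001, 10) / 1000.0             # seconds
imu_v = 0.3 * imu_t + 0.02 * np.random.randn(100)    # raw, drifting IMU values
cv_t = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
cv_v = np.array([0.05, 0.12, 0.18, 0.22, 0.30])      # CV-derived calibration data

# Fit both curves; low-order polynomials stand in for the fitting step.
cal_curve = np.polyval(np.polyfit(cv_t, cv_v, 2), imu_t)    # IMU calibration curve
imu_curve = np.polyval(np.polyfit(imu_t, imu_v, 2), imu_t)  # initial IMU curve

w = 0.7   # assumed weight favouring the drift-free CV-derived curve
optimized = w * cal_curve + (1.0 - w) * imu_curve
# Calibrated IMU values at all 100 moments can now be read from `optimized`
# and passed to the attitude calculation algorithm.
```

• The alternative order described next differs only in when the attitude calculation algorithm is applied: there, the curves are fitted over attitude angle information rather than over the raw IMU data.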
• In still another possible implementation, the electronic device may first obtain the attitude angle information from the IMU data recorded by the sensors in the initial IMU data through the attitude calculation algorithm, then calibrate the attitude angle information according to the first constraint condition, and use the calibrated attitude angle information as the target IMU data. For example, the electronic device performs regression processing according to the three-dimensional space orientation angles of the arm (at the five moments) in the first constraint condition to obtain the attitude angle information corresponding to each arm three-dimensional space orientation angle (at the five moments), and fits the obtained attitude angle information (of the 5 moments) to obtain an attitude angle calibration curve. Meanwhile, attitude angle information is obtained from the initial IMU data through the attitude calculation algorithm and fitted to obtain an attitude angle change curve. Based on the attitude angle calibration curve and the attitude angle change curve, the attitude angle information (of the 100 moments) is obtained and used as the target IMU data obtained in step S204.
• The following describes a scenario in which the user wears a wearable device (including an IMU) on the hand and performs body movements in front of a device with a display screen and a camera: the wearable device obtains the IMU data generated by the user's body movements while the camera acquires the image data generated by the same movements, and the IMU data is calibrated according to the image data, so as to realize a mouse-movement operation on a coordinate position on the display screen.
• In the figure, the moving direction of the wearable device triggered by the user's physical action is indicated by a dashed arrow, and the coordinate displacement of the coordinate position on the display screen is indicated by a solid arrow.
• When the position of the camera on the display screen is fixed (for example, set above the central axis of the display screen): when the user is at position 0, the user's limb movement sweeps a certain angular range, so that the coordinates mapped on the display screen move from A to B; when the user is at position 1, the user's limb movement sweeps the same angular range, so that the coordinates mapped on the display screen move from C to D, but because the relative direction between the camera and the user has changed, the change of the coordinates mapped on the display screen is different even though the limb movement covers the same angular range (that is, the distance between points A and B is not equal to the distance between points C and D); similarly, when the user is at position 2, the user's limb movement sweeps the same angular range, so that the coordinates mapped on the display screen move from E to F. The relative direction between the camera and the user at position 2 is similar to that at position 0, so the resulting coordinate displacement mapped on the screen may be the same (that is, the distance between points A and B is approximately equal to the distance between points E and F); however, since position 2 is close to the right edge of the display screen, it is easy to cause the cursor overflow problem shown in the figure (that is, the coordinates of point F are beyond the range of coordinates covered by the display area).
• Similarly, when the user's orientation changes, the cursor movement path displayed on the display screen corresponding to the user's body movement will also deviate. For example, when the user stands with the front of the body facing the camera versus with the side of the body facing the camera, even if the user's limbs perform the same action, the resulting movement paths of the cursor on the display screen are different. Consider the user's arm moving, with the shoulder joint as the axis, from a position naturally perpendicular at one side of the body to a position parallel to the ground in the same plane as the torso: in one orientation the resulting cursor movement path on the screen may be an arc, while in the other it may be a straight line. That is, when the position and orientation of the human body relative to the display screen change, if the spatial displacement and the angle change of the human-torso coordinate system cannot be tracked, inaccurate positioning and cursor overflow result.
• To address this, the implementation of step S204 can be further optimized, as described in detail below.
• In a possible implementation, the process in which the electronic device calibrates the initial IMU data based on the first constraint condition to obtain the target IMU data may specifically include: the electronic device first determines a first human arm engineering model according to the first image data obtained in step S202, where the first human arm engineering model includes at least one first value range of a limb rotation direction; after that, the first constraint condition is updated based on the first human arm engineering model to obtain an updated first constraint condition, and the updated first constraint condition is further used to calibrate the initial IMU data to obtain the target IMU data.
• That is, the electronic device can also determine a first human arm engineering model according to the first image data obtained in step S202, where the first human arm engineering model includes at least one first value range of a limb rotation direction; in other words, the first human arm engineering model is constructed around the user's individual tendency to fatigue and the principles of minimum work and minimum torque change. In the process of human-computer interaction, the user generally does not make limb movements that violate ergonomics; for example, the movable range of the wrist joint's ulnar deviation is [-25°, 30°], and if wrist joint movement beyond this range is detected, the recognition result of the IMU data acquisition device or the image acquisition device can be considered incorrect. Therefore, the movable ranges of the user's different limbs can be determined to build the human arm ergonomic model.
• For example, the user's fatigue characteristics may be indicated by the gorilla arm effect.
• That is, the user's elbow joint tends to show larger angle changes than the shoulder joint; the movement angle of the shoulder joint may temporarily cover a larger range in the first few minutes, but soon the movement angle of the shoulder joint is reduced to within 30 degrees.
• Similarly, minimum work and minimum torque change refer to the phenomenon that a movement at the end of a rigid cylinder consumes less work and torque than a movement of the same angle at the root of the cylinder.
  • the target IMU data is determined based on the first ergonomic model of the human arm and the first constraint.
• That is, the first constraint condition determined in step S203 (for example, the aforementioned three-dimensional space direction angle) can be input into the human arm engineering model to form a ternary inequality about the attitude angle, and the initial IMU data is calibrated based on this ternary inequality about the attitude angle to obtain the target IMU data in step S204.
• In this way, the human arm engineering model is used as one of the constraint conditions, which can effectively avoid problems such as inaccurate positioning on the display screen and cursor overflow.
• An implementation example of the human arm ergonomic model is shown in FIG. 10.
  • the user's limb is simulated as a four-axis rigid cylinder human arm engineering model.
  • ⁇ 1 indicates the user's shoulder
  • ⁇ 2 indicates the user's shoulder joint
  • ⁇ 3 indicates the user
  • ⁇ 4 indicates the user's wrist joint.
• Each rigid cylinder corresponding to a user limb has different movable distances/angles in different degrees of freedom, and the first human arm ergonomic model includes at least one first value range of a limb rotation direction.
• The degrees of freedom of the user's different limbs can be exemplarily represented as pitch, roll, and yaw, and the physical movements of the user's different limbs can be expressed based on these three degrees of freedom.
  • the adduction or abduction action of the user's shoulder joint can be represented by pitch
  • the forward bending or backward extension action of the user's shoulder joint can be represented by roll
  • the internal rotation or external rotation action of the user's shoulder joint can be represented by yaw.
  • extension and flexion action of the user's elbow joint can be represented by pitch
  • rotation action of the user's elbow joint can be represented by roll
  • extension and flexion action of the user's wrist joint can be represented by pitch
• the ulnar deviation action of the user's wrist joint can be represented by roll.
  • the device including the IMU data collection device is taken as an example of a watch, and the watch is generally worn on the user's wrist joint.
  • the first range of limb movements of the user's wrist joint can be obtained, including extension and flexion [-35°, 50°] and ulnar deviation [-25°, 30°].
• When the first constraint condition obtained through the image acquisition device indicates that the posture angle of the user's wrist joint is an extension and flexion angle, the extension and flexion angle can be input into the first human arm engineering model to form a ternary inequality about the extension and flexion angle, expressed as: min ≤ pitch ≤ max, where min indicates the minimum value of the extension and flexion action in the first value range (that is, -35°), pitch is the value of the extension and flexion angle on this degree of freedom, and max indicates the maximum value of the extension and flexion action in the first value range (that is, 50°).
• When the first constraint condition indicates that the posture angle of the user's wrist joint is an extension and flexion angle that exceeds the first value range, the first constraint condition can be updated based on the first value range, and the exceeding value is updated to the minimum value or the maximum value of the first value range (that is, -35° or 50°).
• It should be noted that the pitch in the ternary inequality can be replaced by other degrees of freedom, which can be flexibly implemented according to specific application scenarios and is not limited here.
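• In code, applying the ternary inequality amounts to a clamp. The ranges below are the wrist-joint values quoted above, mapped to pitch/roll as in the degree-of-freedom list; everything else is an illustrative sketch:

```python
# The wrist-joint value ranges quoted above, keyed by degree of freedom.
WRIST_RANGES_DEG = {
    "pitch": (-35.0, 50.0),   # extension and flexion
    "roll": (-25.0, 30.0),    # ulnar deviation
}

def clamp_to_range(angle_deg, dof, ranges=WRIST_RANGES_DEG):
    """Apply min <= angle <= max: values beyond the first value range are
    updated to its minimum or maximum, as described above."""
    lo, hi = ranges[dof]
    return min(max(angle_deg, lo), hi)

assert clamp_to_range(62.0, "pitch") == 50.0     # exceeds max -> max
assert clamp_to_range(-40.0, "pitch") == -35.0   # below min -> min
assert clamp_to_range(12.5, "pitch") == 12.5     # within range -> unchanged
```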
• If the electronic device performs the initialization process shown in FIG. 3, it can perform calculation according to the initialization image data obtained in step S101 of FIG. 3 to establish an initial human arm engineering model, and when the relative information between the user and the image capture device changes, the initial human arm engineering model is updated to obtain the above-mentioned first human arm engineering model.
• Specifically, the first duration of the initialization process performed by the electronic device in FIG. 3 may include a third time set. At the second time set after the third time set, the electronic device performs CV key point identification on the first image data to obtain the relevant parameters for determining the first relative information; when the first relative information is different from the initial relative information, the electronic device updates the initial human arm engineering model based on the first relative information to obtain the first human arm engineering model.
• For example, the overall coordinate values in the initial human arm ergonomic model can be offset and corrected according to the direction indicated by the difference, to obtain the first human arm ergonomic model.
• The third time set is located before the first time period; that is, the electronic device can also determine the initial image data collected by the image acquisition device at the third time set before the second time set, and construct the initial human arm engineering model according to the initial image data.
• When the initial relative information is different from the first relative information due to the user walking or turning around, that is, when the relative information between the user (for example, the user's torso or body) and the image acquisition device collected at the second time set differs from that collected at the third time set, the first relative information is used to update the initial human arm ergonomic model to obtain the first human arm ergonomic model, thereby further optimizing the cursor control.
• The implementation process of performing calculation according to the initialization image data to establish the initial human arm engineering model is similar to the aforementioned process of determining the first human arm engineering model according to the first image data obtained in step S202, and is not repeated here.
• The following introduces the scenario in which the initial human arm ergonomic model is updated to obtain the above-mentioned first human arm ergonomic model.
• For example, the electronic device can determine the first relative information between the user and the image acquisition device according to the first image data, and when the first relative information is different from the initial relative information, trigger the above updating process to obtain the first human arm ergonomic model. Take as an example the case in which the shoulder joint includes at least two degrees of freedom (extension and flexion, abduction and adduction) and the initial relative information indicates that the user is facing the image acquisition device: an initial human arm engineering model can be established based on the initial relative information, and the first value range of the initial model includes the parameters on the two degrees of freedom of the shoulder joint; for example, the movement range on the abduction and adduction degree of freedom may be [0°, 0°]. Taking as an example the case in which the first relative information indicates that the user has turned 90 degrees sideways relative to the image acquisition device, the first relative information can be used to update the initial human arm engineering model, that is, translation/rotation operations are performed on the coordinates of the initial model according to the difference between the initial relative information and the first relative information to obtain the updated first human arm engineering model. Accordingly, the second value range of the first human arm ergonomic model also includes parameters on the two degrees of freedom of the shoulder joint; for example, the movement range on the abduction and adduction degree of freedom may change from [0°, 0°] to [0°, 90°].
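• Purely as an illustrative sketch of this update (the representation of the model as named value ranges and the update rule below are assumptions introduced here, matching only the [0°, 0°] to [0°, 90°] example above):

```python
# Illustrative sketch (assumed representation): the arm model kept as named
# (min_deg, max_deg) value ranges per degree of freedom, updated when the
# user's orientation relative to the camera changes.
def update_arm_model(model, yaw_change_deg):
    """Widen the shoulder abduction/adduction range by the change in body
    orientation; a stand-in for the translation/rotation of the model
    coordinates described above."""
    lo, hi = model["shoulder_abduction_adduction"]
    updated = dict(model)
    updated["shoulder_abduction_adduction"] = (lo, hi + yaw_change_deg)
    return updated

initial_model = {"shoulder_abduction_adduction": (0.0, 0.0)}  # facing camera
first_model = update_arm_model(initial_model, 90.0)   # user turned 90 degrees
print(first_model)   # {'shoulder_abduction_adduction': (0.0, 90.0)}
```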
• In another possible implementation, when the initial relative information is the same as the first relative information, that is, when the relative information between the user and the image acquisition device collected at the second time set has not changed compared with that collected at the third time set, the electronic device determines the initial human arm engineering model as the first human arm engineering model; there is no need to update the human arm engineering model, thereby improving processing efficiency.
• S205: The electronic device performs coordinate transformation processing on the target IMU data to obtain control information.
• Specifically, the electronic device performs coordinate conversion processing according to the target IMU data obtained in step S204 to obtain the control information, where the control information is used to control the cursor in the display device.
  • the control information is used to control the cursor in the display device to perform related operations, such as moving, dragging, zooming in, clicking, and the like.
• For example, when the control information is used to control the cursor in the display device to move, the control information may be the specific coordinate information (X, Y) of the cursor on the two-dimensional display plane of the display device.
• When the control information is used to control the cursor in the display device to perform dragging, zooming, or clicking, the control information may be an identifier corresponding to the corresponding gesture action (for example, dragging corresponds to identifier 1, zooming corresponds to identifier 2, and clicking corresponds to identifier 3). In step S205, the electronic device can determine, through a preset neural network classifier, whether the target IMU data corresponds to a gesture action of the corresponding category, and if so, use the identifier of the corresponding gesture action as the control information to implement the related operation on the cursor in the display device.
• In a possible implementation, in step S205, the process in which the electronic device performs coordinate transformation processing on the target IMU data to obtain the control information may include: the electronic device determines the first relative information between the user (for example, the user's torso or body) and the image acquisition device according to the first image data, determines the first mapping relationship of the user (for example, the user's torso or body) in the display device according to the first relative information, and then performs coordinate transformation processing on the target IMU data according to the first mapping relationship to obtain the control information.
• The first relative information includes parameters such as distance and standing orientation; for the implementation process, refer to the content in step S204, which is not repeated here.
• In the process of performing the air mouse operation, the relative information between the user and the image capturing device may change. Therefore, the first mapping relationship of the user (for example, the user's torso or body) in the display device can be further determined according to the first relative information determined from the first image data, and the first mapping relationship is used as a basis for processing the control information, thereby avoiding problems such as inaccurate positioning and cursor overflow caused when the relative information changes.
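• The following sketch shows one plausible form of such a mapping relationship (the linear angle-to-pixel mapping, the span values, and the screen size are all assumptions introduced for illustration); note the clamp that keeps the cursor from overflowing the display area:

```python
# Illustrative sketch: a linear mapping from target attitude angles to
# cursor coordinates. The span values (how many degrees of arm rotation
# cover the screen) would be derived from the first relative information
# (distance/orientation); all numbers here are assumed.
def angles_to_cursor(yaw_deg, pitch_deg, mapping, screen_w=1920, screen_h=1080):
    x = (yaw_deg / mapping["yaw_span_deg"] + 0.5) * screen_w
    y = (0.5 - pitch_deg / mapping["pitch_span_deg"]) * screen_h
    # Clamp so the control information never places the cursor outside
    # the display area (the cursor-overflow problem discussed earlier).
    return (min(max(x, 0.0), screen_w - 1.0), min(max(y, 0.0), screen_h - 1.0))

mapping = {"yaw_span_deg": 60.0, "pitch_span_deg": 40.0}   # assumed, at ~2 m
print(angles_to_cursor(15.0, 5.0, mapping))   # -> (1440.0, 405.0)
```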
• In addition, if the electronic device performs the initialization process shown in FIG. 3, it can perform calculation according to the initialization image data obtained in step S101 of FIG. 3 to determine the initial mapping relationship of the user in the display device, and the initial mapping relationship is updated to obtain the above-mentioned first mapping relationship only when the relative information between the user and the image capturing device changes.
• That is, the electronic device may further determine the initial mapping relationship of the user in the display device according to the initial image data. Thereafter, when the initial relative information is different from the first relative information, that is, when the relative information between the user and the image capture device collected at the second time set differs from that collected at the third time set, the first relative information is used to update the initial mapping relationship to obtain the first mapping relationship, so as to further optimize the cursor control.
• In another possible implementation, when the initial relative information is the same as the first relative information, that is, when the relative information between the user (for example, the user's torso or body) and the image acquisition device collected at the second time set has not changed compared with that collected at the third time set, the electronic device determines the initial mapping relationship as the first mapping relationship; there is no need to update the mapping relationship, thereby improving processing efficiency.
• In summary, the electronic device performs CV key point recognition on the first image data obtained by the image acquisition device collecting the user's limb movements at the second time set to obtain the first constraint condition; based on the first constraint condition, it calibrates the initial IMU data obtained by the IMU data acquisition device collecting the user's limb movements at the first time set to obtain the target IMU data, and then performs coordinate transformation processing based on the target IMU data to obtain the control information for controlling the cursor in the display device. The second time set is a subset of the first time set; that is, the process in which the electronic device calibrates the initial IMU data to obtain the target IMU data is asynchronous calibration.
• Since the calculation time of the CV identification process is generally much longer than the processing time of the IMU data, the asynchronous calibration implementation does not need to wait for the long CV processing process, which can effectively avoid problems such as display freezes and display delays. Controlling the cursor in the display device through asynchronous calibration can therefore improve the continuity of the cursor movement in the display device, thereby improving user experience.
• In addition, an embodiment of the present application further provides an electronic device that can perform the above human-computer interaction method; the human-computer interaction method can be performed by one or more electronic devices, and the electronic device may include the following modules.
• an image data determining module 1101, configured to determine and output image data, where the image data includes at least the first image data, corresponding to the implementation process of the aforementioned step S202;
• a CV key point recognition module 1102, configured to perform CV key point recognition on the image data output by the image data determining module 1101 and obtain and output the first constraint condition, corresponding to the implementation process of the aforementioned step S203;
  • An asynchronous calibration module 1104 configured to perform calibration processing at least according to the first constraint condition and the initial IMU data, and obtain and output the target IMU data, corresponding to the implementation process in the aforementioned step S204;
  • the coordinate conversion module 1105 is configured to perform coordinate conversion processing at least according to the target IMU data, and obtain and output control information, which corresponds to the implementation process in the foregoing step S205.
  • the electronic device shown in FIG. 11 may further include other modules as follows.
  • a display device 1106, configured to control the cursor according to the control information
• a preprocessing module 1107, configured to preprocess the initial IMU data and output the preprocessed result to the asynchronous calibration module 1104, where the preprocessing may include waveform smoothing processing, de-noising calibration compensation processing, and the like;
• a human arm engineering model building module 1108, configured to construct the first human arm engineering model according to the first image data output by the image data determining module 1101, and input the first human arm engineering model to the asynchronous calibration module 1104 so that the first constraint condition is updated, as one of the bases for determining the target IMU data; the human arm engineering model building module 1108 can also be configured to construct the initial human arm engineering model according to the initial image data output by the image data determining module 1101, and input the initial human arm engineering model to the asynchronous calibration module 1104 so that the first constraint condition is updated;
• a mapping relationship determination module 1109, configured to determine the first mapping relationship according to the first image data output by the image data determining module 1101 and output the first mapping relationship to the coordinate conversion module 1105 as one of the bases for the coordinate conversion processing; the mapping relationship determination module 1109 can also be configured to determine the initial mapping relationship according to the initial image data output by the image data determining module 1101 and output the initial mapping relationship to the coordinate conversion module 1105 as one of the bases for the coordinate conversion processing;
• a relative information change judging module 1110, configured to determine whether the first relative information indicated by the first image data has changed relative to the initial relative information indicated by the initial image data;
• If the relative information has changed, the relative information change judging module 1110 outputs the judgment result to the human arm engineering model building module 1108, so that the module 1108 outputs the first human arm engineering model to the asynchronous calibration module 1104, and outputs the judgment result to the mapping relationship determination module 1109, so that the module 1109 outputs the first mapping relationship to the coordinate conversion module 1105. If the relative information has not changed, the relative information change judging module 1110 outputs the judgment result to the human arm engineering model building module 1108, so that the module 1108 outputs the initial human arm engineering model to the asynchronous calibration module 1104, and outputs the judgment result to the mapping relationship determination module 1109, so that the module 1109 outputs the initial mapping relationship to the coordinate conversion module 1105.
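• The data flow among these modules could be sketched as follows (the callables passed in stand for modules 1102, 1104, 1105, and 1107; their internals are placeholders rather than the embodiment's implementation, and the wiring simply mirrors steps S203 to S205 as described above):

```python
# Illustrative sketch: chaining the FIG. 11 modules per the described data flow.
def human_computer_interaction(first_image_data, initial_imu_data,
                               cv_keypoint_recognition_1102,
                               asynchronous_calibration_1104,
                               coordinate_conversion_1105,
                               preprocessing_1107):
    first_constraint = cv_keypoint_recognition_1102(first_image_data)   # S203
    imu = preprocessing_1107(initial_imu_data)     # smoothing, de-noising
    target_imu = asynchronous_calibration_1104(first_constraint, imu)   # S204
    control_info = coordinate_conversion_1105(target_imu)               # S205
    return control_info                   # drives the display device 1106
```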
  • an embodiment of the present application further provides a first electronic device 1200 , where the first electronic device 1200 may at least include a motion sensor 1201 and a processor 1202 .
  • the first electronic device 1200 may further include other components, such as a memory, a casing, a communication module, etc., which are not limited here.
• The motion sensor 1201 can be used to implement the implementation process of the IMU data acquisition apparatus in any of the foregoing embodiments, and the processor 1202 can be used to perform the calculation, processing, and other implementation processes in any of the foregoing embodiments and achieve the corresponding beneficial effects, which are not repeated here.
  • an embodiment of the present application further provides a second electronic device 1300 , where the second electronic device 1300 may at least include a camera 1301 and a display screen 1302 .
  • the second electronic device 1300 may also include other components, such as a memory, a casing, a communication module, etc., which are not limited here.
  • the camera 1301 can be used to implement the implementation process of the image acquisition device in any of the foregoing embodiments, and the display screen 1302 can be used to implement the implementation process of the display device in any of the foregoing embodiments and achieve the corresponding beneficial effects, which will not be repeated here one by one.
  • the present application provides an electronic device, which is coupled to a memory and configured to read and execute instructions stored in the memory, so that the electronic device implements the method performed by the electronic device in any of the foregoing embodiments of FIG. 3 to FIG. 11.
  • the electronic device is a chip or a system on a chip.
  • the present application provides a chip system
  • the chip system includes a processor for supporting an electronic device in implementing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above method.
  • the chip system further includes a memory for storing necessary program instructions and data.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • the present application also provides a processor, which is coupled to a memory and configured to execute the methods and functions related to the electronic device in any of the foregoing embodiments.
  • the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a computer, the method process related to the electronic device in any of the foregoing method embodiments is implemented.
  • the computer may be the above electronic device.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling, direct coupling, or communication connection may be implemented through some interfaces; the indirect coupling or communication connection between apparatuses or units may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium.
  • the technical solutions of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.
  • the word "if" as used herein may be interpreted as "at the time of", "when", "in response to determining", or "in response to detecting".
  • similarly, depending on the context, the phrases "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • User Interface Of Digital Computer (AREA)
  • Position Input By Displaying (AREA)

Abstract

The embodiments of the present application provide a human-computer interaction method and a related device, which are used to control a cursor in a display apparatus by means of asynchronous calibration, which can improve the continuity of cursor movement in the display apparatus, thereby improving user experience. In the method, after a camera acquires first image data on the basis of a second sampling frequency in a first time period, a first constraint condition is obtained by performing CV key point recognition on the first image data; a processor can, on the basis of the first constraint condition, calibrate initial motion sensing data acquired on the basis of a first sampling frequency in the first time period, and obtain target motion sensing data; afterwards, the processor further obtains, according to the target motion sensing data, control information for controlling a display screen. The second sampling frequency is less than the first sampling frequency; that is, the processor performs asynchronous calibration on the initial motion sensing data to obtain the target motion sensing data.

Description

Human-computer interaction method and device
This application claims priority to the Chinese patent application No. 202110486465.2, filed with the China National Intellectual Property Administration on April 30, 2021 and entitled "Human-Computer Interaction Method and Device", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of terminal applications, and in particular, to a human-computer interaction method and related equipment.
Background
With the development of science and technology, the full-scenario immersive experience has become a development trend for terminal devices in the process of human-computer interaction with terminal devices.
At present, there are various application scenarios for the full-scenario immersive experience, for example, the application scenario in which a user performs human-computer interaction with a device with a display screen, such as a computer, a TV, or a smart screen (also called a large screen).
However, the traditional way of achieving human-computer interaction through control devices such as remote controls or mice can no longer meet current needs.
Summary of the Invention
Embodiments of the present application provide a human-computer interaction method and related equipment for controlling a cursor in a display device through asynchronous calibration, which can improve the continuity of cursor movement in the display device, thereby improving user experience.
A first aspect of the embodiments of the present application provides a human-computer interaction method, where the method may be applied to a human-computer interaction system including a motion sensor, a camera, a processor, and a display screen, and the method includes: the motion sensor acquires initial motion sensing data at a first sampling frequency within a first time period, where the initial motion sensing data is triggered by the user's limb movements; the camera acquires first image data at a second sampling frequency within the first time period, where the second sampling frequency is less than the first sampling frequency and the first image data includes the user's limb movement information; thereafter, the processor obtains a first constraint condition, where the first constraint condition is obtained by performing computer vision (CV) processing on the first image data; the processor calibrates the initial motion sensing data according to the first constraint condition to obtain target motion sensing data; and further, the processor obtains control information according to the target motion sensing data, where the control information is used to control the display screen.
Based on the above technical solution, after the camera acquires the first image data at the second sampling frequency within the first time period, the first constraint condition is obtained through CV key point recognition; the processor calibrates, based on the first constraint condition, the initial motion sensing data acquired at the first sampling frequency within the first time period to obtain the target motion sensing data; thereafter, the processor further obtains, according to the target motion sensing data, the control information for controlling the display screen. The second sampling frequency is less than the first sampling frequency; that is, the processor performs asynchronous calibration on the initial motion sensing data to obtain the target motion sensing data. Limited by hardware computing capability, the computation time of the CV recognition process is generally much longer than the processing time of the IMU data. Compared with a human-computer interaction approach based on real-time synchronous calibration, this asynchronous calibration does not need to wait for the lengthy CV processing, which effectively avoids problems such as display stuttering and display delay, so that controlling the cursor in the display device through asynchronous calibration can improve the continuity of cursor movement in the display device, thereby improving user experience.
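For illustration only, the following is a minimal sketch of how such an asynchronous calibration loop could be organized, written in Python; all function names (read_imu, try_get_cv_constraint, calibrate, to_control_info, send_to_display) and the rate values are hypothetical placeholders, not a definitive implementation of the claimed method:

    import time

    IMU_RATE_HZ = 100  # first sampling frequency (assumed value)
    CV_RATE_HZ = 10    # second sampling frequency, lower than the IMU rate

    def asynchronous_calibration_loop(read_imu, try_get_cv_constraint,
                                      calibrate, to_control_info,
                                      send_to_display):
        """Consume high-rate IMU samples continuously; whenever a slower
        CV-derived constraint becomes available, adopt it, instead of
        blocking cursor updates on the CV processing."""
        constraint = None
        while True:
            sample = read_imu()               # high-rate motion sensing data
            latest = try_get_cv_constraint()  # non-blocking; None while CV is busy
            if latest is not None:
                constraint = latest           # asynchronously adopt newest constraint
            data = calibrate(sample, constraint) if constraint is not None else sample
            send_to_display(to_control_info(data))
            time.sleep(1.0 / IMU_RATE_HZ)

The point of the sketch is that the IMU path never waits for the CV path: cursor updates continue at the first sampling frequency, and each CV result is applied as soon as it arrives.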
It should be noted that the processor may be provided in the same electronic device as the motion sensor, or in the same electronic device as the camera, or in the same electronic device as the display screen, which is not limited here.
In a possible implementation manner of the first aspect, the first constraint condition is obtained through human skeleton key point recognition in the computer vision (CV) processing performed on the first image data, and the first constraint condition includes three-dimensional spatial orientation angle information.
Optionally, the CV processing may be implemented based on a three-dimensional human skeleton recognition technology, or based on a two-dimensional human skeleton recognition technology, which is not limited here.
Based on the above technical solution, the first constraint condition used for the asynchronous calibration of the initial motion sensing data may be the three-dimensional spatial orientation angle information obtained through the CV recognition processing.
In a possible implementation manner of the first aspect, the processor calibrating the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data specifically includes: the processor first maps the first constraint condition to obtain calibration data, and fits the calibration data to obtain a first curve; the processor then fits the initial motion sensing data to obtain a second curve, and performs weighted average processing on the first curve and the second curve to obtain a third curve; the processor determines the calibrated motion sensing data from the third curve; thereafter, the processor processes the calibrated motion sensing data according to an attitude calculation algorithm to obtain the target motion sensing data.
Based on the above technical solution, the target motion sensing data may be data obtained after processing by the attitude calculation algorithm, where, in the process of asynchronously calibrating the initial motion sensing data with the first constraint condition, the initial motion sensing data may first be calibrated and then processed by the attitude calculation algorithm to obtain the target motion sensing data.
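As an illustration of this curve-fusion step, the following Python sketch uses polynomial fitting as one possible fitting method (the embodiments do not prescribe a particular fitting technique); the synthetic data, polynomial degree, and weight are assumed values:

    import numpy as np

    def fuse_by_weighted_curves(t, imu_series, t_cv, cv_calibration,
                                w_cv=0.4, deg=3):
        """Fit a first curve to the CV-derived calibration data and a second
        curve to the raw IMU series, then take their weighted average as the
        third curve, from which the calibrated data is read off."""
        first_curve = np.polyval(np.polyfit(t_cv, cv_calibration, deg), t)
        second_curve = np.polyval(np.polyfit(t, imu_series, deg), t)
        third_curve = w_cv * first_curve + (1.0 - w_cv) * second_curve
        return third_curve  # calibrated motion sensing data sampled on t

    # Synthetic example: a 100 Hz IMU series with noise and a 10 Hz CV reference.
    t = np.linspace(0.0, 1.0, 100)
    t_cv = np.linspace(0.0, 1.0, 10)
    imu = np.sin(2 * np.pi * t) + 0.05 * np.random.randn(t.size)
    cv = np.sin(2 * np.pi * t_cv)
    calibrated = fuse_by_weighted_curves(t, imu, t_cv, cv)

The calibrated series would then be passed through the attitude calculation algorithm to yield the target motion sensing data.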
In a possible implementation manner of the first aspect, the processor calibrating the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data specifically includes: the processor first processes the initial motion sensing data according to the attitude calculation algorithm to obtain first attitude angle data; the processor then fits the first attitude angle data to obtain a fourth curve, and fits the first constraint condition to obtain a fifth curve; thereafter, the processor performs weighted average processing on the fourth curve and the fifth curve to obtain a sixth curve; further, the processor determines the target motion sensing data from the sixth curve.
Based on the above technical solution, the target motion sensing data may be data obtained after processing by the attitude calculation algorithm, where, in the process of asynchronously calibrating the initial motion sensing data with the first constraint condition, the initial motion sensing data may first be processed by the attitude calculation algorithm, and the processing result is then calibrated based on the first constraint condition to obtain the target motion sensing data.
In a possible implementation manner of the first aspect, the control information is coordinate data obtained by performing coordinate conversion on the target motion sensing data, where the coordinate data is used to control the display position of a cursor in the display screen; or, the control information is a gesture identification result obtained by mapping the target motion sensing data, where the gesture identification result is used to operate an interface element of the display screen.
Based on the above technical solution, the control information for controlling the display screen, obtained by processing the target motion sensing data obtained through asynchronous calibration, can be used to perform a variety of operations on the display screen, for example, controlling the display position of the cursor in the display screen, or operating interface elements in the display screen, such as selecting, zooming, dragging, and clicking.
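As a toy illustration of the gesture branch, the following sketch maps features of the target motion sensing data to a gesture identification result; the thresholds and gesture names are invented for illustration only:

    def identify_gesture(angular_speed_dps, linear_accel_g):
        """Map simple motion features to a gesture identification result;
        a real implementation would use richer features or a classifier."""
        if linear_accel_g > 1.5:
            return "click"   # sharp forward jab -> click an interface element
        if angular_speed_dps > 120.0:
            return "swipe"   # fast wrist rotation -> drag/swipe
        return None          # no gesture: fall back to cursor coordinate data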
In a possible implementation manner of the first aspect, before the processor calibrates the initial motion sensing data according to the first constraint condition, the method further includes: the processor aligns the first constraint condition and the initial motion sensing data according to a time difference.
Based on the above technical solution, due to inherent differences in hardware, a time difference inevitably exists between different devices (for example, a device containing the motion sensor and a device containing the camera), and this time difference may cause the cursor displayed on the display screen to be misplaced or its trajectory to be inaccurate. Therefore, to eliminate the adverse effect of this objectively existing time difference, the first constraint condition and the initial motion sensing data may be aligned according to the determined time difference, so as to eliminate the influence of the time difference.
In a possible implementation manner of the first aspect, the time difference is calculated through an initialization process before the first time period, and the method further includes: the display screen displays first prompt information for prompting the user to make a designated limb movement; thereafter, the motion sensor acquires motion sensing data in the initialization process, where the motion sensing data in the initialization process is triggered by the designated limb movement made by the user; the camera acquires image data in the initialization process, where the image data in the initialization process includes information about the designated limb movement made by the user; further, the processor determines the time difference according to the signal features of the motion sensing data in the initialization process and the signal features of the image data in the initialization process.
Based on the above technical solution, during the initialization process, the user is prompted on the display screen to perform a specific limb movement; while the user performs the specific limb movement, the motion sensor acquires the motion sensing data of the initialization process and the camera acquires the image data of the initialization process, and the processor then performs processing based on the motion sensing data and the image data to determine the time difference.
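The embodiments only state that the time difference is determined from the signal features of the two data streams; cross-correlation of the two signals is one common way this could be realized, sketched below under the assumption that the CV-derived signal has already been resampled to the IMU rate:

    import numpy as np

    def estimate_time_offset(imu_signal, cv_signal, imu_dt):
        """Estimate the device time difference as the lag that maximizes the
        cross-correlation between the IMU signal and the CV-derived signal
        captured while the user performs the prompted movement."""
        a = (imu_signal - imu_signal.mean()) / (imu_signal.std() + 1e-9)
        b = (cv_signal - cv_signal.mean()) / (cv_signal.std() + 1e-9)
        corr = np.correlate(a, b, mode="full")
        lag = int(corr.argmax()) - (len(b) - 1)  # lag in samples
        return lag * imu_dt                      # time difference in seconds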
In a possible implementation manner of the first aspect, the method further includes: the processor determines initial relative information between the user and the camera according to the image data in the initialization process.
Optionally, the initial relative information may include distance, orientation, and the like.
In a possible implementation manner of the first aspect, the processor calibrating the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data includes: the processor determines an initial human arm engineering model according to the initial relative information, where the initial human arm engineering model includes a first value range of at least one limb movement angle; the processor then updates the first constraint condition according to the initial human arm engineering model to obtain an updated first constraint condition; thereafter, the processor calibrates the initial motion sensing data according to the updated first constraint condition to obtain the target motion sensing data.
Based on the above technical solution, in the process of human-computer interaction, the user generally does not make limb movements that violate ergonomics. Therefore, the first constraint condition can be updated through the human arm engineering model constructed from the relative information between the user and the display screen, that is, the first constraint condition is further constrained based on the human arm engineering model, so as to avoid the problems of cursor inaccuracy and cursor overflow.
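As a non-limiting sketch of how a human arm engineering model could further constrain the first constraint condition, the following clamps CV-derived angles to plausible joint ranges; the joint names and value ranges are hypothetical and would in practice be derived from the relative information:

    # Hypothetical value ranges (in degrees) of limb movement angles.
    ARM_MODEL = {
        "shoulder_pitch": (-60.0, 170.0),
        "elbow_flexion": (0.0, 145.0),
    }

    def constrain_with_arm_model(constraint_angles, arm_model=ARM_MODEL):
        """Clamp CV-derived orientation angles to the ergonomically plausible
        range, which helps avoid cursor inaccuracy and cursor overflow caused
        by implausible CV estimates."""
        return {joint: min(max(angle, arm_model[joint][0]), arm_model[joint][1])
                for joint, angle in constraint_angles.items()}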
In a possible implementation manner of the first aspect, the process in which the processor updates the first constraint condition according to the initial human arm engineering model to obtain the updated first constraint condition may specifically include: the processor determines first relative information between the user and the camera according to the first image data; then, when the first relative information is different from the initial relative information, the processor updates the initial human arm engineering model according to the first relative information to obtain a first human arm engineering model; thereafter, the processor updates the first constraint condition according to the first human arm engineering model to obtain the updated first constraint condition.
Based on the above technical solution, when the relative information (for example, distance or orientation) between the user and the camera changes, the human arm engineering model constructed based on that relative information can also be updated according to the different relative information, and the updated human arm engineering model is used to further constrain the first constraint condition, so as to ensure the timeliness of the control information in controlling the display screen.
In a possible implementation manner of the first aspect, the processor obtaining the control information according to the target motion sensing data includes: first, the processor determines an initial mapping relationship of the user in the display device according to the initial relative information; then, the processor performs coordinate conversion processing on the target motion sensing data according to the initial mapping relationship to obtain the control information.
Based on the above technical solution, since the user may move during cursor control, the relative information between the user and the camera may change. Therefore, the initial mapping relationship of the user in the display screen can be further determined according to the initial relative information determined in the initialization process, and this initial mapping relationship is used as the processing basis of the control information, so as to avoid problems such as inaccurate positioning and cursor overflow caused by changes in the relative information.
In a possible implementation manner of the first aspect, the processor performing coordinate conversion processing on the target motion sensing data according to the initial mapping relationship to obtain the control information includes: first, the processor determines first relative information between the user and the camera according to the first image data; then, when the first relative information is different from the initial relative information, the processor updates the initial mapping relationship according to the first relative information to obtain a first mapping relationship; thereafter, the processor performs coordinate conversion processing on the target motion sensing data according to the first mapping relationship to obtain the control information.
Based on the above technical solution, when the relative information (for example, distance or orientation) between the user and the camera changes, the initial mapping relationship determined based on that relative information can also be updated according to the different relative information, and the updated mapping relationship is used to perform coordinate conversion processing on the target motion sensing data to obtain the control information, so as to ensure the timeliness of the control information in controlling the display screen.
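As an illustration of the coordinate conversion step, the following sketch converts orientation angles from the target motion sensing data into cursor coordinates through a mapping relationship; all field names and gain values are hypothetical, and the gains would typically be updated when the user's distance or orientation relative to the camera changes:

    def to_screen_coords(yaw_deg, pitch_deg, mapping):
        """Convert orientation angles into cursor coordinates using a mapping
        relationship derived from the user's relative information."""
        x = mapping["cx"] + yaw_deg * mapping["px_per_deg_x"]
        y = mapping["cy"] - pitch_deg * mapping["px_per_deg_y"]
        # Clamp to the screen so the cursor cannot overflow the display.
        x = min(max(x, 0), mapping["width"] - 1)
        y = min(max(y, 0), mapping["height"] - 1)
        return int(x), int(y)

    # Example mapping for a 1920x1080 screen (hypothetical values).
    mapping = {"cx": 960, "cy": 540, "px_per_deg_x": 25.0,
               "px_per_deg_y": 25.0, "width": 1920, "height": 1080}
    print(to_screen_coords(10.0, -5.0, mapping))  # -> (1210, 665)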
In a possible implementation manner of the first aspect, the motion sensor includes sensing units of one or more of an accelerometer, a gyroscope, and a magnetometer.
Based on the above technical solution, the motion sensor may be an IMU data acquisition apparatus, where the IMU data acquisition apparatus may include sensing units of one or more of an accelerometer, a gyroscope, and a magnetometer.
In a possible implementation manner of the first aspect, the camera includes one or more of a depth camera and a non-depth camera.
Based on the above technical solution, the camera may have a variety of implementations, for example, a depth camera or a non-depth camera, so that the solution can be adapted to different application scenarios.
A second aspect of the embodiments of the present application provides a first electronic device, including a motion sensor and a processor, where the motion sensor is configured to acquire initial motion sensing data at a first sampling frequency within a first time period, the initial motion sensing data being triggered by the user's limb movements; the processor is configured to calibrate the initial motion sensing data according to an acquired first constraint condition to obtain target motion sensing data, where the first constraint condition is obtained by performing computer vision (CV) processing on first image data acquired by a camera at a second sampling frequency within the first time period, the second sampling frequency is less than the first sampling frequency, and the first image data includes the user's limb movement information; further, the processor is further configured to obtain control information according to the target motion sensing data, where the control information is used to control the display content of a display screen; and the camera and the display screen are included in a second electronic device different from the first electronic device.
Based on the above technical solution, in the first electronic device, the motion sensor acquires the initial motion sensing data at the first sampling frequency within the first time period, and the first electronic device also acquires the first constraint condition obtained by performing computer vision (CV) processing on the first image data acquired by the camera at the second sampling frequency within the first time period; thereafter, the processor in the first electronic device calibrates, based on the first constraint condition, the initial motion sensing data acquired at the first sampling frequency within the first time period to obtain the target motion sensing data, and the processor further obtains, according to the target motion sensing data, the control information for controlling the display screen. The second sampling frequency is less than the first sampling frequency; that is, the processor performs asynchronous calibration on the initial motion sensing data to obtain the target motion sensing data. Limited by hardware computing capability, the computation time of the CV recognition process is generally much longer than the processing time of the IMU data. Compared with a human-computer interaction approach based on real-time synchronous calibration, this asynchronous calibration does not need to wait for the lengthy CV processing, which effectively avoids problems such as display stuttering and display delay, so that controlling the cursor in the display device through asynchronous calibration can improve the continuity of cursor movement in the display device, thereby improving user experience.
In a possible implementation manner of the second aspect, the processor is specifically configured to: map the first constraint condition to obtain calibration data, and fit the calibration data to obtain a first curve; then fit the initial motion sensing data to obtain a second curve, and perform weighted average processing on the first curve and the second curve to obtain a third curve; determine the calibrated motion sensing data from the third curve; and further process the calibrated motion sensing data according to an attitude calculation algorithm to obtain the target motion sensing data.
Based on the above technical solution, the target motion sensing data may be data obtained after processing by the attitude calculation algorithm, where, in the process of asynchronously calibrating the initial motion sensing data with the first constraint condition, the initial motion sensing data may first be calibrated and then processed by the attitude calculation algorithm to obtain the target motion sensing data.
In a possible implementation manner of the second aspect, the processor is specifically configured to: process the initial motion sensing data according to the attitude calculation algorithm to obtain first attitude angle data; then fit the first attitude angle data to obtain a fourth curve, fit the first constraint condition to obtain a fifth curve, and perform weighted average processing on the fourth curve and the fifth curve to obtain a sixth curve; and thereafter determine the target motion sensing data from the sixth curve.
Based on the above technical solution, the target motion sensing data may be data obtained after processing by the attitude calculation algorithm, where, in the process of asynchronously calibrating the initial motion sensing data with the first constraint condition, the initial motion sensing data may first be processed by the attitude calculation algorithm, and the processing result is then calibrated based on the first constraint condition to obtain the target motion sensing data.
In a possible implementation manner of the second aspect, the processor is further configured to: align the first constraint condition and the initial motion sensing data according to a time difference.
Based on the above technical solution, due to inherent differences in hardware, a time difference inevitably exists between different devices (for example, a device containing the motion sensor and a device containing the camera), and this time difference may cause the cursor displayed on the display screen to be misplaced or its trajectory to be inaccurate. Therefore, to eliminate the adverse effect of this objectively existing time difference, the first constraint condition and the initial motion sensing data may be aligned according to the determined time difference, so as to eliminate the influence of the time difference.
In a possible implementation manner of the second aspect, the time difference is calculated through an initialization process before the first time period; the motion sensor is further configured to acquire motion sensing data in the initialization process, where the motion sensing data in the initialization process is triggered by a designated limb movement made by the user; in addition, the processor is further configured to determine the time difference according to the signal features of the motion sensing data in the initialization process and the signal features of the image data in the initialization process, where the image data in the initialization process is acquired by the camera during the initialization process and includes information about the designated limb movement made by the user.
Based on the above technical solution, during the initialization process, the user is prompted on the display screen to perform a specific limb movement; while the user performs the specific limb movement, the motion sensor acquires the motion sensing data of the initialization process and the camera acquires the image data of the initialization process, and the processor then performs processing based on the motion sensing data and the image data to determine the time difference.
In a possible implementation manner of the second aspect, the processor is further configured to: determine initial relative information between the user and the camera according to the image data in the initialization process.
Optionally, the initial relative information may include distance, orientation, and the like.
In a possible implementation manner of the second aspect, the processor is specifically configured to: determine an initial human arm engineering model according to the initial relative information, where the initial human arm engineering model includes a first value range of at least one limb movement angle; then update the first constraint condition according to the initial human arm engineering model to obtain an updated first constraint condition; and thereafter calibrate the initial motion sensing data according to the updated first constraint condition to obtain the target motion sensing data.
Based on the above technical solution, in the process of human-computer interaction, the user generally does not make limb movements that violate ergonomics. Therefore, the first constraint condition can be updated through the human arm engineering model constructed from the relative information between the user and the display screen, that is, the first constraint condition is further constrained based on the human arm engineering model, so as to avoid the problems of cursor inaccuracy and cursor overflow.
In a possible implementation manner of the second aspect, the processor is specifically configured to: determine first relative information between the user and the camera according to the first image data; then, when the first relative information is different from the initial relative information, update the initial human arm engineering model according to the first relative information to obtain a first human arm engineering model; and further update the first constraint condition according to the first human arm engineering model to obtain the updated first constraint condition.
Based on the above technical solution, when the relative information (for example, distance or orientation) between the user and the camera changes, the human arm engineering model constructed based on that relative information can also be updated according to the different relative information, and the updated human arm engineering model is used to further constrain the first constraint condition, so as to ensure the timeliness of the control information in controlling the display screen.
In a possible implementation manner of the second aspect, the processor is further configured to: first determine an initial mapping relationship of the user in the display device according to the initial relative information; and then perform coordinate conversion processing on the target motion sensing data according to the initial mapping relationship to obtain the control information.
Based on the above technical solution, since the user may move during cursor control, the relative information between the user and the camera may change. Therefore, the initial mapping relationship of the user in the display screen can be further determined according to the initial relative information determined in the initialization process, and this initial mapping relationship is used as the processing basis of the control information, so as to avoid problems such as inaccurate positioning and cursor overflow caused by changes in the relative information.
In a possible implementation manner of the second aspect, the processor is specifically configured to: determine first relative information between the user and the camera according to the first image data; then, when the first relative information is different from the initial relative information, update the initial mapping relationship according to the first relative information to obtain a first mapping relationship; and further perform coordinate conversion processing on the target motion sensing data according to the first mapping relationship to obtain the control information.
Based on the above technical solution, when the relative information (for example, distance or orientation) between the user and the camera changes, the initial mapping relationship determined based on that relative information can also be updated according to the different relative information, and the updated mapping relationship is used to perform coordinate conversion processing on the target motion sensing data to obtain the control information, so as to ensure the timeliness of the control information in controlling the display screen.
In a possible implementation manner of the second aspect, the motion sensor includes sensing units of one or more of an accelerometer, a gyroscope, and a magnetometer.
Based on the above technical solution, the motion sensor may be an IMU data acquisition apparatus, where the IMU data acquisition apparatus may include sensing units of one or more of an accelerometer, a gyroscope, and a magnetometer.
It should be noted that the motion sensor and the processor included in the first electronic device in the second aspect can also perform the implementation processes in the first aspect and any possible implementation manner thereof, and achieve the corresponding beneficial effects, which are not repeated here one by one.
A third aspect of the embodiments of the present application provides a second electronic device, including a camera and a display screen, where the camera is configured to acquire first image data at a second sampling frequency within a first time period, the first image data including the user's limb movement information; the first image data is used to determine a first constraint condition, and the first constraint condition is used to calibrate initial motion sensing data to obtain target motion sensing data, where the initial motion sensing data is sampled by a motion sensor in a first electronic device at a first sampling frequency within the first time period and is triggered by the user's limb movements; the second sampling frequency is less than the first sampling frequency; thereafter, the display screen is configured to display control information, where the control information is obtained based on the target motion sensing data.
Based on the above technical solution, in the second electronic device, the camera acquires the first image data at the second sampling frequency within the first time period, and the first image data is used to determine the first constraint condition; the first constraint condition can be used to calibrate the initial motion sensing data acquired at the first sampling frequency within the first time period to obtain the target motion sensing data, after which the control information for controlling the display screen is further obtained according to the target motion sensing data, so that the display screen displays according to the control information. The second sampling frequency is less than the first sampling frequency; that is, the target motion sensing data is obtained by asynchronously calibrating the initial motion sensing data. Limited by hardware computing capability, the computation time of the CV recognition process is generally much longer than the processing time of the IMU data. Compared with a human-computer interaction approach based on real-time synchronous calibration, this asynchronous calibration does not need to wait for the lengthy CV processing, which effectively avoids problems such as display stuttering and display delay, so that controlling the cursor in the display device through asynchronous calibration can improve the continuity of cursor movement in the display device, thereby improving user experience.
In a possible implementation manner of the third aspect, the first constraint condition is obtained through human skeleton key point recognition in the computer vision (CV) processing performed on the first image data, and the first constraint condition includes three-dimensional spatial orientation angle information.
Optionally, the CV processing may be implemented based on a three-dimensional human skeleton recognition technology, or based on a two-dimensional human skeleton recognition technology, which is not limited here.
Based on the above technical solution, the first constraint condition used for the asynchronous calibration of the initial motion sensing data may be the three-dimensional spatial orientation angle information obtained through the CV recognition processing.
In a possible implementation manner of the third aspect, the control information is coordinate data obtained by performing coordinate conversion on the target motion sensing data, where the coordinate data is used to control the display position of a cursor in the display screen; or, the control information is a gesture identification result obtained by mapping the target motion sensing data, where the gesture identification result is used to operate an interface element of the display screen.
Based on the above technical solution, the control information for controlling the display screen, obtained by processing the target motion sensing data obtained through asynchronous calibration, can be used to perform a variety of operations on the display screen, for example, controlling the display position of the cursor in the display screen, or operating interface elements in the display screen, such as selecting, zooming, dragging, and clicking.
In a possible implementation manner of the third aspect, the display screen is further configured to display first prompt information for prompting the user to make a designated limb movement; in addition, the camera is further configured to acquire image data in an initialization process before the first time period, where the image data in the initialization process includes information about the designated limb movement made by the user; the time difference is determined according to the signal features of the image data in the initialization process and the signal features of the motion sensing data in the initialization process, where the time difference is used to align the first constraint condition and the initial motion sensing data, and the motion sensing data in the initialization process is acquired by the second electronic device during the initialization process.
Based on the above technical solution, due to inherent differences in hardware, a time difference inevitably exists between different devices (for example, a device containing the motion sensor and a device containing the camera), and this time difference may cause the cursor displayed on the display screen to be misplaced or its trajectory to be inaccurate. Therefore, to eliminate the adverse effect of this objectively existing time difference, the first constraint condition and the initial motion sensing data may be aligned according to the determined time difference, so as to eliminate the influence of the time difference.
In a possible implementation manner of the third aspect, the camera includes one or more of a depth camera and a non-depth camera.
Based on the above technical solution, the camera may have a variety of implementations, for example, a depth camera or a non-depth camera, so that the solution can be adapted to different application scenarios.
It should be noted that the camera and the display screen included in the second electronic device in the third aspect can also perform the implementation processes in the first aspect and any possible implementation manner thereof, and achieve the corresponding beneficial effects, which are not repeated here one by one.
A fourth aspect of the embodiments of the present application provides an electronic device, including a processor coupled to a memory, where the memory is configured to store a program, and the processor is configured to execute the program in the memory, so that the electronic device performs the human-computer interaction method described in the above aspects.
It should be noted that the motion sensor mentioned in the above human-computer interaction method may be integrated in the electronic device, or may be provided independently outside the electronic device and connected to the electronic device in a wired/wireless manner, which is not limited here. Similarly, the camera mentioned in the above human-computer interaction method may be integrated in the electronic device, or may be provided independently outside the electronic device and connected to the electronic device in a wired/wireless manner, which is not limited here. Similarly, the display screen mentioned in the above human-computer interaction method may be integrated in the electronic device, or may be provided independently outside the electronic device and connected to the electronic device in a wired/wireless manner, which is not limited here.
A fifth aspect of the embodiments of the present application provides a computer program that, when run on a computer, causes the computer to execute the human-computer interaction method described in the first aspect and any implementation manner thereof.
A sixth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program that, when run on a computer, causes the computer to execute the human-computer interaction method described in the first aspect and any implementation manner thereof.
A seventh aspect of the embodiments of the present application provides a circuit system, where the circuit system includes a processing circuit configured to execute the human-computer interaction method described in the first aspect and any implementation manner thereof.
An eighth aspect of the embodiments of the present application provides a chip system, where the chip system includes a processor configured to support the implementation of the functions involved in the first aspect and any implementation manner thereof, for example, sending or processing the data and/or information involved in the above method. In a possible design, the chip system further includes a memory for storing the program instructions and data necessary for the server or the communication device. The chip system may be composed of chips, or may include chips and other discrete devices.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of an implementation of human-computer interaction;
FIG. 2 is a schematic diagram of an application scenario in an embodiment of the present application;
FIG. 3 is a schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 4 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 5 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 6 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 7 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 8 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 9 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 10 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 11 is another schematic diagram of a human-computer interaction method provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a first electronic device provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of a second electronic device provided by an embodiment of the present application.
Detailed Description of Embodiments
With the development of science and technology, the full-scenario immersive experience has become a development trend for terminal devices in the course of human-computer interaction. There are many application scenarios for the full-scenario immersive experience, such as human-computer interaction through extended reality (XR) technologies including virtual reality (VR), augmented reality (AR), and mixed reality (MR), or human-computer interaction with devices that have a display screen, such as computers, televisions, and smart screens (also called large screens). The traditional approach of achieving human-computer interaction by controlling with a remote control, a mouse, or similar devices can no longer meet current needs.
To realize a full-scenario immersive experience, multi-device fusion technology can be used to make a series of improvements to human-computer interaction, focusing on user-experience friendliness, smoothness of device use, and ease of use. User limb movement is a direct and convenient input method: a wearable device equipped with sensors such as an inertial measurement unit (IMU) can serve as a medium to collect movements of the user's limbs (for example, the hand or wrist); in addition, the limb-movement information carried in images or video captured by a camera can be used as feedback to identify the user's operation intention, so that the user can interact with the machine in a clearer and smoother way.
Take, as an example, the process in which a user interacts with a device that has a display screen through an air mouse mode (also called mid-air mouse mode, mid-air operation mode, etc.). In general, the air mouse mode refers to a human-computer interaction mode in which sensors (such as a gyroscope or a 3-dimension gravity sensor (3D-Gsensor)) are added to a wireless mouse or wireless control device, so that the cursor on the display screen follows the movement of the user's limb in the air, without the device having to rest on a fixed desktop.
Furthermore, the air mouse mode can be extended to a human-computer interaction mode in which a terminal device (such as a wearable device like a watch or a wristband) controls the cursor on the display screen (performing operations such as moving, dragging, zooming in, and clicking). The cursor may be an icon or graphic of any shape, size, or transparency. At present, the air mouse mode mainly has the following mainstream implementations:
Mode 1: use a wearable device equipped with an IMU to recognize the user's limb movements, and control the cursor on the display screen according to the recognition result, so as to realize human-computer interaction.
Specifically, the main components of an IMU include a gyroscope, an accelerometer, and a magnetometer. The gyroscope can detect the angular velocity of the wearable device relative to a navigation coordinate system (such as an earth-fixed coordinate system or a geographic coordinate system); the accelerometer can detect the acceleration of the wearable device along the three axes of the carrier coordinate system (for example, with the coordinate origin of the wearable device as the carrier center origin, the three axes being the left-right axis, the front-rear axis, and the up-down axis of the carrier); and the magnetometer can obtain information about the magnetic field around the smart watch.
The main function of the IMU is to fuse the data of the three sensors, the gyroscope, the accelerometer, and the magnetometer, obtain relatively accurate attitude information through an attitude calculation algorithm, and recognize the user's limb movements based on this attitude information. Generally, attitude calculation algorithms include the Mahony algorithm, Kalman filter algorithms, and the like. This implementation requires little computing power and has good real-time performance, so the cursor position on the display screen refreshes quickly and tracks smoothly.
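As a rough illustration of the fusion just described, and not the specific algorithm of any embodiment, the following Python sketch implements one step of a basic complementary filter, a much-simplified relative of the Mahony and Kalman approaches mentioned above. The axis convention and the gain value are assumptions made here for illustration.

```python
import numpy as np

def attitude_step(roll, pitch, gyro, accel, dt, alpha=0.98):
    """One update of a basic complementary filter (roll/pitch only).

    roll, pitch : current attitude estimate, radians
    gyro        : (gx, gy, gz) angular rates, rad/s (axis convention assumed)
    accel       : (ax, ay, az) specific force, m/s^2
    dt          : sampling interval, s (10 ms for a 100 Hz IMU)
    alpha       : blend gain; 0.98 is an illustrative assumption
    """
    gx, gy, _ = gyro
    ax, ay, az = accel

    # Short-term estimate: integrate the gyro (responsive but drifts).
    roll_g = roll + gx * dt
    pitch_g = pitch + gy * dt

    # Long-term reference: gravity direction from the accelerometer
    # (noisy but drift-free while the device is not accelerating hard).
    roll_a = np.arctan2(ay, az)
    pitch_a = np.arctan2(-ax, np.sqrt(ay ** 2 + az ** 2))

    # Blend: the accelerometer term continually pulls the integrated
    # estimate back toward gravity, bounding the gyro's drift.
    roll = alpha * roll_g + (1 - alpha) * roll_a
    pitch = alpha * pitch_g + (1 - alpha) * pitch_a
    return roll, pitch
```

Note that gravity constrains only roll and pitch; this is one reason why, as discussed next, an IMU alone cannot bound translational drift.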
In Mode 1, a certain spatial translation tracking error exists in the IMU calculation process, and the attitude calculation algorithm cannot eliminate this error during its continuous integration, so the error easily accumulates and causes severe drift of the cursor position on the display screen. Therefore, the spatial translation of the user's limb movements cannot be accurately tracked using the IMU alone.
Mode 2: use computer vision (CV) recognition technology to recognize specific gestures in the user's limb movements, and control the cursor on the display screen according to the recognition result, so as to realize human-computer interaction.
Specifically, information about the user's limb movements, such as a specific gesture performed by the user (for example, an upward or downward swipe), is first collected by a variety of devices, and the cursor on the display screen is controlled according to that gesture. For example, the devices may include cameras (depth cameras or non-depth cameras, etc.) and/or other sensors (such as the photoelectric sensor used in photoplethysmography (PPG), infrared, radar, etc.). Taking a depth camera as an example: the depth camera collects image information of the specific gesture performed by the user, and the processor performs CV recognition on this image information using a pre-established image recognition model to obtain a first determination result; the other sensors collect sensing signals of the gesture, and the processor recognizes them using a pre-established sensor recognition model to obtain a second determination result. The processor then fuses the first and second determination results to determine the gesture performed by the user, and controls the cursor on the display screen according to the control operation corresponding to that gesture. Both the image recognition model and the sensor recognition model include correspondences between specific gestures and control operations on the display screen.
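The decision-level fusion described above can be illustrated with a minimal Python sketch. The dictionary format of the classifier outputs and the reliability weights are assumptions made here for illustration, not details of the disclosure.

```python
def fuse_gesture_results(cv_result, sensor_result):
    """Decision-level fusion of two gesture classifiers (illustrative only).

    cv_result, sensor_result: dicts mapping gesture label -> confidence,
    e.g. {"swipe_up": 0.7, "swipe_down": 0.2}. Weights are assumptions
    about the relative reliability of the two models.
    """
    w_cv, w_sensor = 0.6, 0.4
    labels = set(cv_result) | set(sensor_result)
    fused = {
        label: w_cv * cv_result.get(label, 0.0)
               + w_sensor * sensor_result.get(label, 0.0)
        for label in labels
    }
    # The gesture with the highest fused confidence drives the cursor action.
    return max(fused, key=fused.get)
```

In such a pipeline, the returned label would be looked up in the gesture-to-operation correspondence table to decide how to move the cursor.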
In Mode 2, the computation required by the depth camera's processing is far greater than that required by the other sensors, so there is a large gap between their computation times. In other words, the gesture-recognition control operation depends on the long processing time of CV recognition on the depth data, which easily causes problems such as cursor stutter and display delay for the corresponding control operation, resulting in a poor user experience.
Mode 3: fuse the IMU positioning solution of Mode 1 with the CV recognition solution of Mode 2 to realize a real-time-calibrated human-computer interaction mode.
Specifically, in Mode 3, a device containing an IMU recognizes the attitude information of the user's limb movements to obtain initial positioning information; at the same time, a device containing a camera (as in Mode 2, the camera may be a depth or non-depth camera; a depth camera is taken as an example here) recognizes and locates the image information of the user's limb movements to obtain calibration information. The initial positioning information is then calibrated according to the calibration information to obtain a calibration result, and the cursor on the display screen is operated based on this calibration result to realize human-computer interaction. This calibration process can reduce the spatial translation tracking error in the IMU calculation and achieve real-time tracking of the user's limb movements.
Exemplarily, the implementation process shown in Fig. 1 is used here as an example to illustrate the real-time-calibrated human-computer interaction mode.
As shown in Fig. 1, human-computer interaction using multi-device real-time fusion technology includes the following steps.
S1. Power-on initialization;
S2. The user operates the device containing the IMU to form an air-mouse movement trajectory, so that the device containing the IMU collects IMU data;
S3. Image data is collected by the depth camera, and CV recognition processing is performed based on the image data; the resulting CV recognition result serves as the basis for the real-time calibration in step S4;
S4. Combining the CV recognition result obtained in step S3 with the IMU data obtained in step S2 by the device containing the IMU, the offset angle data is tracked and determined in real time;
S5. Coordinate transformation is performed on the offset angle data obtained in step S4 to obtain coordinate data mapped to the screen (a sketch of one possible mapping follows this list);
S6. The coordinate data (X, Y) obtained in step S5 is displayed on the screen in real time; alternatively, instead of displaying the coordinate data (X, Y), a cursor is displayed at the screen position corresponding to (X, Y) (where the cursor may be an icon/graphic/image of any size, shape, or transparency).
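Steps S5 and S6 can be illustrated with a minimal sketch of one possible offset-angle-to-screen mapping. The field-of-view range and screen resolution below are assumed values, not parameters specified by the disclosure.

```python
def angles_to_screen(yaw, pitch, screen_w=1920, screen_h=1080,
                     fov_h=30.0, fov_v=20.0):
    """Map offset angles (degrees) to pixel coordinates (illustrative).

    yaw / pitch : horizontal / vertical offset angles from step S4
    fov_h/fov_v : assumed angular range mapped onto the full screen
    """
    # Normalize each angle into [0, 1] over the usable angular range...
    x = (yaw + fov_h / 2) / fov_h
    y = (pitch + fov_v / 2) / fov_v
    # ...then clamp and scale to the screen resolution.
    x = min(max(x, 0.0), 1.0) * (screen_w - 1)
    y = min(max(y, 0.0), 1.0) * (screen_h - 1)
    return int(x), int(y)
```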
However, in the implementation shown in Fig. 1, the response time of IMU data collection differs from that of CV recognition processing, so the implementation of Mode 3 suffers from serious delay and lag, the cursor position is unstable, and the user experience is poor.
Obviously, although Mode 3 can to some extent solve the inaccurate positioning caused in Mode 1 by using only the IMU for cursor positioning, the computation of CV recognition by the device containing the depth camera is far greater than the computation of IMU data processing by the device containing the IMU, so there is a large gap between their computation times. That is, every frame of the cursor displayed on the screen depends entirely on the real-time CV calibration; limited by hardware computing capability, each CV recognition computation (generally several hundred milliseconds (ms)) takes far longer than each computation that forms the air-mouse trajectory from the data collected by the IMU (generally a few milliseconds to a dozen or so milliseconds), so the real-time-calibrated interaction mode (i.e., Mode 3) must wait out the lengthy CV recognition processing.
Exemplarily, assume each IMU data processing takes 10 milliseconds and each CV recognition takes 200 milliseconds. In the example shown in Fig. 1, consider one second of the user's air-mouse operation, denoted as the interval (0, 1000] (the unit of all intervals here and below is milliseconds). Within this one-second interval, the CV recognition results in step S3 consist of CV keypoint data at 5 instants: (200, 400, 600, 800, 1000). To achieve synchronous calibration of the IMU data based on the CV recognition results, the instants at which IMU data is collected in step S2 must be restricted to the same 5 instants. As a result, in step S6 the cursor data displayed on the screen in real time covers only those 5 instants, i.e., the refresh frequency of the cursor on the screen can at most equal the CV recognition frequency of 5 hertz (Hz), which easily causes problems such as cursor stutter and display delay, resulting in a poor user experience.
In addition, in Modes 2 and 3, when different devices (for example, a device containing a depth camera and a device containing an IMU) collect different data, the sampling precision of the devices may differ, so the data collected by different devices corresponds to different timestamps, generally with a time difference at the millisecond level. This time difference also easily leads to inaccurate calibration in Mode 3, causing problems such as misplaced cursor display on the screen, and is another cause of poor user experience.
To solve the above problems, the embodiments of the present application provide a human-computer interaction method and related devices for controlling the cursor in a display apparatus through asynchronous calibration, which can improve the continuity of cursor movement in the display apparatus, thereby improving user experience.
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
In this embodiment and subsequent embodiments, the motion sensor is described only with an IMU data collection apparatus as an example; the motion sensor may obviously also be another apparatus, such as an accelerometer data collection apparatus, a gyroscope data collection apparatus, a magnetometer data collection apparatus, or another apparatus, which is not limited here.
It can be understood that the display screen may be included in a display apparatus, and the display apparatus may further include a base for carrying the display screen, physical or touch-screen buttons for controlling display parameters (such as brightness and contrast), a power supply providing power to the display screen, a wired/wireless communication module for transmitting control instructions to the display screen, and so on, which is not limited here. The display apparatus mentioned in the following embodiments is mainly used to realize the display function of the display screen.
It can be understood that the camera may be included in an image collection apparatus, and the image collection apparatus may further include a power supply providing power to the camera, a wired/wireless communication module for transmitting control instructions to the camera, and so on, which is not limited here. The image collection apparatus mentioned in the following embodiments is mainly used to realize the shooting (image or video) function of the camera.
Referring to Fig. 2, a schematic diagram of an application scenario of an embodiment of the present application, the scenario includes at least an image collection apparatus, a display apparatus, and an IMU data collection apparatus. The IMU data collection apparatus is used to collect IMU data and may be included in a terminal device equipped with an IMU, such as a mobile phone, a remote control device (e.g., a remote control or a handle), a tablet computer, or a wearable device (e.g., a smart watch or smart wristband). The display apparatus is a picture output apparatus and may be included in a device with a display screen, such as a computer, a television, or a smart screen (also called a large screen). The image collection apparatus is used to collect image data (including one or more frames of image information, or a video stream containing multiple frames of images, etc.) and may be a camera, such as a depth camera or a non-depth camera, or another image collection device, which is not limited here.
Optionally, the image collection apparatus and the display apparatus may be integrated in the same device, such as a computer, a television, or a smart screen (also called a large screen).
In addition, in the embodiments of the present application, the electronic device that executes the asynchronous calibration process in the human-computer interaction method includes a processor, and this electronic device can be implemented in many ways. For example, the electronic device may be a device containing the IMU data collection apparatus, connected in a wired/wireless manner to one or more devices containing the image collection apparatus and/or the display apparatus; or the electronic device may be a device containing the display screen, connected in a wired/wireless manner to one or more devices containing the image collection apparatus and/or the IMU data collection apparatus; or the electronic device may be a device containing the image collection apparatus, connected in a wired/wireless manner to one or more devices containing the display apparatus and/or the IMU data collection apparatus; or the electronic device may be another device, i.e., one that contains none of the image collection apparatus, display apparatus, and IMU data collection apparatus (such as a smart speaker, a robot, a server, or a computing hub), connected in a wired/wireless manner to one or more devices containing the image collection apparatus, the display apparatus, and/or the IMU data collection apparatus. The electronic device can receive data sent by one or more devices and send data to one or more devices in a wired/wireless manner, and the processor of the electronic device can process both the data collected by the device itself and the data it receives.
In a typical application scenario, a user holds a mobile phone and interacts with a large screen. In this scenario, the mobile phone is the device containing the IMU data collection apparatus, and the large screen is the device containing the display apparatus. The mobile phone and the large screen respectively execute the relevant steps of the human-computer interaction method provided by the embodiments of the present application, so that when the user holds the phone, makes limb movements, and drives the phone to move, the large screen responds to the phone's movement trajectory by displaying a cursor at the corresponding position on the screen.
In the above typical application scenario, since the performance of the mobile phone's processor is usually better than that of the large screen's processor, the asynchronous calibration process in the human-computer interaction method provided by the embodiments of the present application may preferably be executed by the mobile phone's processor. Of course, the asynchronous calibration process may also be executed by the processor of the large screen, or by the processor of another device (such as a server or computing hub), which is not limited in this application.
In addition, the application scenarios of the human-computer interaction method provided by the embodiments of the present application further include, but are not limited to: a user wearing a watch/wristband interacting with a large screen, a user wearing a motion sensor interacting with a large screen, and a user holding a remote control interacting with a head-mounted display device. It should be understood that human-computer interaction between any device containing an IMU data collection apparatus and any device containing a display apparatus, in any combination, can adopt the human-computer interaction method provided by the embodiments of the present application.
It should be noted that in the subsequent embodiments, the electronic device is taken as a device equipped with the IMU data collection apparatus, and the image collection apparatus and the display apparatus are both integrated in another device (such as a computer, a television, or a smart screen). That is, the electronic device contains the IMU data collection apparatus and communicates with the other device through a wired/wireless connection to obtain the image data collected by the image collection apparatus, or the processing result obtained by the image collection apparatus from the collected image data, so that through the implementation of the human-computer interaction method provided by this application, the electronic device obtains and sends to the display apparatus the control information for controlling the cursor in the display apparatus.
In the application scenario shown in Fig. 2, the user's hand (including fingers, wrist, palm, etc.) carries the device containing the IMU data collection apparatus in a handheld or worn manner. The user moves the hand within the shooting area of the image collection apparatus, driving the IMU data collection apparatus to move. During this process, the IMU data collection apparatus collects IMU data and the image collection apparatus collects image data; the electronic device obtains the IMU data and image data and performs fusion processing to obtain air-mouse data, based on which a cursor can be displayed on the display apparatus. The user can thus interact with the display apparatus in air-mouse mode by moving the hand, for example, selecting, moving, dragging, zooming in, or clicking interface elements in the display interface of the display apparatus from a distance.
As mentioned above, due to inherent hardware differences, a time difference inevitably exists between different devices, usually at the microsecond or millisecond level. This time difference has a considerable impact on the asynchronous calibration process in the human-computer interaction method provided by the embodiments of the present application, causing the finally displayed cursor to be misplaced or its trajectory to be inaccurate.
Therefore, to eliminate the adverse effect of this objectively existing time difference, before implementing the human-computer interaction method provided by the embodiments of the present application, initialization may first be performed to calculate this time difference. Then, when the method is subsequently implemented, the data collected by the IMU data collection apparatus and the data collected by the image collection apparatus are aligned according to the time difference calculated during initialization, so that the finally displayed cursor is correctly placed and its trajectory is accurate.
The above initialization process may be a process in which the user performs a specified limb movement facing the display apparatus. For example, as shown in Fig. 4, the user wears a watch; the display screen shows the text "Please face the screen" to prompt the user to adjust their position relative to the screen, and shows the text "Please draw a W curve" to prompt the user to perform the specified limb movement; the user moves their arm and draws a W curve in the air. The watch's processor can then calculate the inherent hardware time difference between the IMU and the camera from the IMU data collected while the user draws the W curve and the image data collected by the camera (or the recognition result obtained from the image data).
Taking the above application scenario as an example: the user carries an electronic device equipped with the IMU data collection apparatus in a handheld or worn manner, and when the user is within the shooting area of the image collection apparatus and enters (or disconnects and re-enters) the human-computer interaction process with the display apparatus for the first time, the electronic device can perform the initialization process through the human-computer interaction method shown in Fig. 3 below, so as to align the time information (also called timestamps or time axes) between the IMU data collection apparatus and the image collection apparatus. The implementation shown in Fig. 3 can be used to avoid the problems in the aforementioned Modes 2 and 3, where inaccurate calibration caused by the time difference between data collected by different devices leads to misplaced cursor display.
The implementation process shown in Fig. 3 will first be described in detail below.
Referring to Fig. 3, a schematic flowchart of a human-computer interaction method provided by an embodiment of the present application, the method includes the following steps.
S101. Determine initialization IMU data and initialization image data within a first duration.
In this embodiment, during the first duration corresponding to the initialization process, when the user carries the electronic device equipped with the IMU data collection apparatus in a handheld or worn manner and is within the shooting area of the image collection apparatus, the electronic device collects the user's limb movements through the IMU data collection apparatus within the first duration to obtain the initialization IMU data, and the image collection apparatus collects the user's limb movements within the first duration to obtain the initialization image data.
In step S101, the electronic device may receive the initialization image data through wired/wireless communication with the image collection apparatus.
It should be noted that in step S101 the sampling frequencies of the IMU data collection apparatus and the image collection apparatus may be the same, for example both 100 hertz (Hz), i.e., within any second of the first duration the two apparatuses can collect IMU data at 100 instants and image data at 100 instants. Alternatively, the sampling frequencies of the two apparatuses may differ; for example, the IMU data collection apparatus samples at 100 Hz and the image collection apparatus at 5 Hz, i.e., within any second of the first duration the two apparatuses can collect IMU data at 100 instants and image data at 5 instants.
S102. Determine the difference between the timestamp of the IMU data collection apparatus and the timestamp of the image collection apparatus according to the initialization IMU data and the initialization image data.
In this embodiment, in step S102 the electronic device determines the difference between the timestamps of the IMU data collection apparatus and the image collection apparatus according to the initialization IMU data and initialization image data obtained in step S101; after the initialization process, the timestamps of the two apparatuses can be aligned according to this difference.
In a possible implementation, in step S102 the electronic device may analyze the signal characteristics of the initialization image data and the initialization IMU data: the difference may be determined by calculating fluctuation frequencies, or by regression linear fitting, etc., which is not limited here.
Exemplarily, determining the difference by calculating fluctuation frequencies in step S102 is described here as an example. The electronic device may obtain an IMU data fluctuation curve within the first duration from the initialization IMU data, and determine the peak positions (and/or trough positions) of this curve; the electronic device also determines a CV data fluctuation curve within the first duration from the CV keypoint detection result of the initialization image data, and determines the peak positions (and/or trough positions) of that curve, where the CV keypoint detection result includes the three-dimensional direction angles of the position at which the user holds or wears the electronic device (e.g., the left wrist or right wrist). The electronic device then compares the time information of the peak positions (and/or trough positions) in the IMU data fluctuation curve with the time information of the peak positions (and/or trough positions) in the CV data fluctuation curve, and the resulting time difference is the difference between the timestamps of the IMU data collection apparatus and the image collection apparatus. Thus, after step S102, the IMU data collected by the IMU data collection apparatus and the image data of the image collection apparatus can be aligned according to this difference.
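A minimal sketch of the peak-alignment idea described above, assuming both curves are available as sampled arrays and using SciPy's generic peak finder as a stand-in for whatever detector an implementation would actually use:

```python
import numpy as np
from scipy.signal import find_peaks

def estimate_time_offset(imu_t, imu_sig, cv_t, cv_sig):
    """Estimate the IMU-vs-camera timestamp difference from waveform peaks.

    imu_t, cv_t    : sample timestamps in milliseconds
    imu_sig, cv_sig: the corresponding fluctuation curves, e.g. gyro
                     magnitude vs. the CV wrist-keypoint angle while the
                     user draws the 'W'. Assumes both devices observed the
                     same gesture and each curve has clear peaks.
    """
    imu_t, cv_t = np.asarray(imu_t), np.asarray(cv_t)
    imu_peaks, _ = find_peaks(np.asarray(imu_sig))
    cv_peaks, _ = find_peaks(np.asarray(cv_sig))
    n = min(len(imu_peaks), len(cv_peaks))
    if n == 0:
        raise ValueError("gesture too flat: no peaks to align")
    # Pair the peaks in order and average the per-peak time differences.
    diffs = imu_t[imu_peaks[:n]] - cv_t[cv_peaks[:n]]
    return float(np.mean(diffs))  # positive: the IMU clock runs ahead
```

The returned offset would then be subtracted from (or added to) one device's timestamps to align the two data streams.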
It should be noted that in the above example, the image collection apparatus may perform CV keypoint recognition on the initialization images to obtain the CV keypoint detection result and send that result to the electronic device in step S101; alternatively, the image collection apparatus may send the initialization images to the electronic device in step S101, so that the electronic device performs CV keypoint recognition on them to obtain the CV keypoint detection result, which is not limited here.
In a possible implementation, when the user performs the initialization process upon first entering the air-mouse operation in step S101, the electronic device may determine that the user has entered the initialization process based on various trigger methods, and then execute steps S101 and S102. For example, the initialization process may be triggered by a specific limb movement of the user (such as a left or right swipe) collected by the IMU data collection apparatus, or triggered when the user is within the shooting area of the image collection apparatus, or triggered when communication is established between the image collection apparatus and the IMU data collection apparatus, or by other trigger methods, which is not limited here.
In a possible implementation, after step S101 the electronic device may further determine initial relative information between the user and the image collection apparatus according to the initialization IMU data and initialization image data; the initial relative information may include parameters such as distance and orientation.
Exemplarily, the CV keypoint recognition process may use the two CV keypoints of the left shoulder and right shoulder to obtain the shoulder-width parameter, which is used to determine the distance between the user and the image collection apparatus and the relative orientation between them. Alternatively, facial keypoints, such as the eye spacing and ear spacing, may be used to determine the distance between the user and the image collection apparatus and the relative orientation between them. For example, the electronic device may obtain a fixed proportional relationship from any frame of image collected during the initialization process; for instance, assuming by default that body rotation does not change the head width, the ratio of the user's head width to shoulder width during initialization is a preset proportional coefficient, and a subsequent change in shoulder width changes the proportional coefficient, from which the change of orientation is derived.
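The head-width/shoulder-width proportion described above can be turned into an orientation estimate roughly as follows; the cos(yaw) shoulder-foreshortening model is an illustrative assumption, not a formula from the disclosure.

```python
import numpy as np

def estimate_yaw(head_w_px, shoulder_w_px, ratio0):
    """Estimate body yaw from the head/shoulder pixel-width ratio.

    ratio0: head-width / shoulder-width ratio measured at initialization,
    when the user faced the camera (yaw = 0). Assumes the apparent head
    width does not change with body rotation, while the apparent shoulder
    width shrinks roughly as cos(yaw).
    """
    ratio_now = head_w_px / shoulder_w_px         # grows as the body turns
    cos_yaw = np.clip(ratio0 / ratio_now, -1.0, 1.0)
    return np.degrees(np.arccos(cos_yaw))         # yaw magnitude, degrees
```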
The implementation process of determining the initial relative information will be exemplarily described below.
Embodiment 1
The electronic device can determine the initial relative information by executing the implementation process shown in step S101. The user's hand (including fingers, wrist, palm, etc.) carries the device equipped with the IMU data collection apparatus in a handheld or worn manner, and the user is within the shooting area of the image collection apparatus; a video stream of a preset operation gesture is collected, the video stream including multiple frames of user images, and CV keypoint detection is then performed on the multiple frames of user images captured by the image collection apparatus to obtain the initial relative information. For example, the initial relative information may serve as a calibration reference value for the relative information (such as distance and orientation): at any moment after the initialization process, if the collected relative information differs from this calibration reference value (for example, because the user walks around or turns), it is determined that the relative information (such as distance and orientation) between the user and the image collection apparatus has changed.
Specifically, take as an example a wearable device, a smart watch containing an IMU, as the electronic device equipped with the IMU data collection apparatus. Wearing the smart watch, the user stands facing the display screen and draws in the air a system-preset, approximately axisymmetric motion, including but not limited to a W curve or an O curve. The image collection apparatus captures the gesture video stream data (including at least a first image and a second image), and through CV keypoint detection obtains parameters of the user's initial state, such as the standing distance, the body orientation initialized as frontal, the shoulder width, and the head size, as the initial image data, while synchronously aligning the IMU data corresponding to the preset motion collected by the IMU.
For example, the initialization process may be implemented as shown in Fig. 4. The initialization process includes: the display screen shows "Please face the screen" through the interface; after the image collection apparatus detects that the user is facing the screen, the user can be reminded to perform the initialization operation, i.e., the interface shows "Please draw a W curve"; thereafter, after the image collection apparatus detects that the user has drawn a W curve in the air, the image collection apparatus collects the initial image data of the user drawing the W curve, the IMU data collection apparatus collects the user's IMU data, and the initial image data and the IMU data are aligned to initialize the cursor position.
As another example, the initialization process may be implemented as shown in Fig. 5. The initialization process includes: when the image collection apparatus detects the user, it assumes by default that the user is facing the screen; the user can then be reminded to perform the initialization operation, with the display screen showing "Please face the screen and draw a W curve" through the interface; thereafter, after the image collection apparatus detects that the user has drawn a W curve in the air, the image collection apparatus collects the initial image data of the user drawing the W curve, the IMU data collection apparatus collects the user's IMU data, and the initial image data and the IMU data are aligned to initialize the cursor position.
Embodiment 2
The electronic device can determine the initial relative information without executing the implementation process shown in step S101. The user stands in front of the screen of the display apparatus and powers on into air-mouse mode; using the ranging technology of a monocular image collection apparatus (such as a monocular camera), specifically, the actual coordinates of the corresponding pixels can be calculated based on similar-triangle proportions, parameters such as the model distance and body shape are initialized, and the watch's IMU measurement data and the display apparatus's initial image data are aligned.
Specifically, the user can enter the air-mouse mode without an initialization gesture. Monocular ranging places relatively high requirements on camera calibration and requires the distortion introduced by the lens itself to be small, but overall this method has strong portability and practicality, and it can also achieve accurate estimation of the initialization parameters. Compared with Embodiment 1, omitting the initialization process in Embodiment 2 may sacrifice ranging accuracy; however, a depth image collection apparatus can be used to compensate for the accuracy.
That is, in Embodiment 2, the monocular image collection apparatus is modeled directly in the initialization stage, and the user's standing distance and body-shape parameters (i.e., orientation) are obtained through the ranging principle of the monocular camera. In Embodiment 2, there is no need to obtain the user initialization parameters by directly performing CV keypoint detection on an initialization gesture video stream, which reduces the complexity of operations and saves hardware memory and computing costs.
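A minimal sketch of the similar-triangle ranging principle mentioned above, under the pinhole camera model; the focal length and the physical reference width are assumed calibration inputs rather than values from the disclosure.

```python
def monocular_distance(focal_px, real_width_m, width_px):
    """Distance from similar triangles under the pinhole camera model.

    focal_px     : focal length in pixels, from camera calibration
    real_width_m : assumed physical width of the reference feature,
                   e.g. ~0.4 m for shoulder width (an assumption)
    width_px     : measured pixel width of that feature in the image
    """
    # Similar triangles: width_px / focal_px = real_width_m / distance
    return focal_px * real_width_m / width_px

# e.g. a 0.4 m shoulder width spanning 200 px with f = 1000 px -> 2.0 m
```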
Optionally, after the initialization process shown in Fig. 3, i.e., after aligning the time information of the IMU data collection apparatus and the image collection apparatus, the electronic device can execute the human-computer interaction method shown in Fig. 6 to control the cursor in the display apparatus through asynchronous calibration, which can improve the continuity of cursor movement in the display apparatus and thereby improve user experience.
Optionally, the electronic device may also directly execute the human-computer interaction method shown in Fig. 6 without the initialization process shown in Fig. 3. For example, when the collection/processing capabilities of the IMU data collection apparatus and the image collection apparatus are strong, the collection/processing time difference between the two apparatuses may be at the microsecond level or even lower; at such a low level, the user cannot perceive any effect of the time difference on the cursor display during human-computer interaction. Thus, in the human-computer interaction method shown in Fig. 6, the cursor in the display apparatus can be controlled through asynchronous calibration without the initialization process shown in Fig. 3, improving the continuity of cursor movement in the display apparatus and thereby improving user experience.
Referring to Fig. 6, a schematic flowchart of a human-computer interaction method provided by an embodiment of the present application, the method includes the following steps.
S201. The electronic device determines initial IMU data.
In this embodiment, when the user carries the electronic device equipped with the IMU data collection apparatus in a handheld or worn manner and is within the shooting area of the image collection apparatus, the electronic device collects the user's limb movements through the IMU data collection apparatus at a first set of instants to obtain the initial IMU data.
It should be noted that the first set of instants is contained in a first time period. For example, if the sampling frequency of the IMU data collection apparatus is a first sampling frequency (e.g., 100 Hz), the multiple instants contained in the first set of instants are the 100 instants within each second of the first time period.
As described above with reference to Fig. 2, the case where the electronic device contains the IMU data collection apparatus is taken as an example here; for instance, the electronic device is a mobile phone, a remote control device (such as a remote control or a handle), a tablet computer, or a wearable device (such as a smart watch or smart wristband).
Specifically, in step S201, when the user carries the electronic device equipped with the IMU data collection apparatus in a handheld or worn manner and performs the air-mouse operation, the IMU continuously tracks changes in the user's gesture; the main components of the IMU, such as the gyroscope, accelerometer, and magnetometer, record the IMU data, and the user's limb movements are collected at the first set of instants to obtain the initial IMU data, so that the electronic device can obtain and determine the initial IMU data in step S201.
Optionally, the initial IMU data obtained in step S201 may be the initial IMU data obtained after the IMU data recorded by the IMU data collection apparatus is processed by waveform smoothing, denoising calibration compensation, or other methods, which is not limited here.
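As one example of such pre-processing, a simple moving-average smoother could look as follows; the window length is an assumed value, not one specified by the disclosure.

```python
import numpy as np

def smooth_imu(samples, window=5):
    """Moving-average smoothing of raw IMU samples (illustrative only).

    samples: 1-D numpy array of raw readings from one IMU axis.
    window : averaging window length; 5 samples (~50 ms at 100 Hz)
             is an assumed value.
    """
    kernel = np.ones(window) / window
    # mode="same" keeps the output aligned with the input timestamps.
    return np.convolve(samples, kernel, mode="same")
```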
S202. The electronic device determines first image data.
In this embodiment, when the user carries the electronic device equipped with the IMU data collection apparatus in a handheld or worn manner and is within the shooting area of the image collection apparatus, the image collection apparatus collects the user's limb movements at a second set of instants to obtain image data; the image data may include one or more frames of image information, a video stream containing multiple frames of images, etc. In step S202, the electronic device may obtain the first image data through a wired/wireless communication connection with the device containing the image collection apparatus.
It should be noted that the second set of instants is contained in the first time period. For example, if the sampling frequency of the image collection apparatus is a second sampling frequency (e.g., 5 Hz), the multiple instants contained in the second set of instants are the 5 instants within each second of the first time period.
Specifically, as described above for the real-time-calibrated interaction mode (i.e., Mode 3), limited by hardware computing capability, the computation time of the CV recognition process is generally far longer than the processing time of the IMU data. For example, each computation of CV recognition generally takes several hundred milliseconds, while each processing of IMU data generally takes a few milliseconds to a dozen or so milliseconds, a difference of at least an order of magnitude. Therefore, the second set of instants corresponding to the image data collected by the image collection apparatus in step S202 may be a subset of the first set of instants corresponding to the initial IMU data collected by the IMU data collection apparatus in step S201.
Exemplarily, assume each IMU data processing takes 10 milliseconds and each CV recognition takes 200 milliseconds. In steps S201 and S202, the user carries the device equipped with the IMU data collection apparatus in a handheld or worn manner and performs the air-mouse operation within the shooting area of the image collection apparatus during a one-second interval denoted as (0, 1000] (the unit of all intervals here and below is milliseconds). In this example, the first set of instants at which the initial IMU data is collected in step S201 is (10, 20, 30, ..., 200, 210, 220, ..., 400, ..., 600, ..., 800, ..., 980, 990, 1000), 100 instants in total, and the second set of instants at which the first image data is collected in step S202 is (200, 400, 600, 800, 1000), 5 instants in total. Subsequently, in steps S203 and S204, the initial IMU data collected at the 100 instants can be asynchronously calibrated based on the first image data collected at the 5 instants, so as to avoid problems such as cursor drift that exist when only IMU data is used for interaction (see the description of Mode 1 above).
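The two sampling grids in this example can be reconstructed in a few lines, confirming that the CV instants form a subset of the IMU instants (the literal values are just the ones from the example above):

```python
# IMU samples every 10 ms, CV results every 200 ms, over (0, 1000] ms.
imu_times = {10 * k for k in range(1, 101)}   # 10, 20, ..., 1000
cv_times = {200 * k for k in range(1, 6)}     # 200, 400, ..., 1000

assert cv_times <= imu_times  # CV instants are a subset of IMU instants
print(len(imu_times), len(cv_times))          # 100 5
```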
具体地,在步骤S202的实现过程中,图像采集装置可以是检测到用户手持或佩戴IMU数据采集装置且用户的位置处于图像采集装置的拍摄区域时,或者是,图像采集装置响应于用户的语音唤醒,或者是,图像采集装置响应于用户在电子设备上的操作,执行采集图像数据的过程,此处不做限定。Specifically, in the implementation process of step S202, the image acquisition device may detect that the user holds or wears the IMU data acquisition device and the user's position is in the shooting area of the image acquisition device, or the image acquisition device responds to the user's voice Wake up, or, in response to a user's operation on the electronic device, the image acquisition apparatus performs a process of acquiring image data, which is not limited here.
此外,图像采集装置所包含的摄像头的数量可以设置为一个,即图像采集装置通过单个摄像头采集得到图像数据,或者,图像采集装置所包含摄像头的数量也可以设置为多个,即图像采集装置通过多个摄像头采集得到图像数据,此处不做限定。示例性的,当图像采集装置所包含的摄像头为多个时,可以通过不同的摄像头来覆盖不同范围的场景,并解决了摄像头无法来回切换焦距的问题,还可以便于通过多个摄像头所采集得到的多组图像数据进行测距以提高测距准确度。当图像采集装置所包含的摄像头为一个时,相较于多个摄像头的布置方式,可以节省摄像头的硬件设置,并减小图像数据的计算量以提升后续光标 响应速度。In addition, the number of cameras included in the image capture device may be set to one, that is, the image capture device acquires image data through a single camera, or the number of cameras included in the image capture device may also be set to multiple, that is, the image capture device uses Image data is acquired by multiple cameras, which is not limited here. Exemplarily, when the image acquisition device includes multiple cameras, different cameras can be used to cover scenes of different ranges, and the problem that the cameras cannot switch the focal length back and forth can be solved. ranging from multiple sets of image data to improve ranging accuracy. When the image acquisition device includes one camera, compared with the arrangement of multiple cameras, the hardware settings of the camera can be saved, and the calculation amount of image data can be reduced to improve the subsequent cursor response speed.
In this embodiment and subsequent embodiments, a single camera is used as an example. Refer to FIG. 7 for implementation examples of the positional relationship between the image acquisition device and the display device in the electronic device. The image acquisition device may be placed outside the display area of the display device: for example, near the upper bezel of the display device as shown in (a) of FIG. 7, near the lower bezel as shown in (b) of FIG. 7, or at other positions such as near the left bezel, near the right bezel, or near the upper-left, upper-right, lower-left, or lower-right corner of the display device; this is not limited here. Alternatively, the image acquisition device may be placed within the display area of the display device: for example, near the upper edge of the display area as shown in (c) of FIG. 7, in the middle of the display area as shown in (d) of FIG. 7, or at other positions within the display area, such as near other edges; this is not limited here.
S203. The electronic device performs CV key point recognition according to the first image data to obtain a first constraint condition.
In this embodiment, the electronic device performs CV key point recognition according to the first image data determined in step S202 to obtain the first constraint condition.
Specifically, in step S203, the electronic device reads the first image data, uses human-skeleton recognition technology to perform CV key point recognition on the human body contained in the image data, and determines the recognition result as the first constraint condition; for example, the CV recognition result may include three-dimensional spatial azimuth information of the CV key points. The CV key point recognition and localization process may use 9, 14, 16, 21, or some other number of CV key points; this is not limited here. As an example, consider the implementation shown in FIG. 8 with 9 CV key points: the user's left eye, nose, right eye, left shoulder, right shoulder, left elbow, right elbow, left wrist, and right wrist. In step S203, the CV key points include at least the position where the user holds or wears the electronic device containing the IMU data acquisition device, such as the user's left elbow, left wrist, right elbow, right wrist, left-hand fingers, or right-hand fingers, or other CV key points, which can be adjusted according to the specific application scenario; this is not limited here.
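As a minimal sketch (the key point names follow the 9-point example of FIG. 8; the coordinates are placeholders rather than real recognition output), one possible shape for a recognition result is:

```python
# Hypothetical recognition result for the 9 CV key points of FIG. 8.
# Each key point maps to an (x, y, z) position in the camera frame;
# the zero values below are placeholders for illustration only.
KEYPOINT_NAMES = [
    "left_eye", "nose", "right_eye",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
]

recognition_result = {name: (0.0, 0.0, 0.0) for name in KEYPOINT_NAMES}

# The first constraint condition is derived from the key point(s) where
# the IMU device is held or worn, e.g. the right wrist for a watch.
constraint_keypoint = recognition_result["right_wrist"]
```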
It should be noted that in step S203 the CV key point recognition process may also be performed by the device containing the image acquisition device to obtain the first constraint condition; that is, in step S202 the device containing the image acquisition device may send the first constraint condition to the electronic device. In this scenario, the electronic device does not need to perform the CV key point recognition process itself, which reduces its processing delay and improves its response speed.
As an implementation example, whether the electronic device or the device containing the image acquisition device performs the CV key point recognition process, the input image data (for example, the aforementioned first image data) may be processed by a preset neural network model to obtain the first constraint condition. Processing through the neural network model can greatly improve processing efficiency and further improve the response speed of the cursor in the display device.
Optionally, the preset neural network model may be obtained by training on training samples, where each training sample may include image data and label data. The label data may be the CV key point coordinates corresponding to the image data, or the constraint condition corresponding to the image data (for example, a three-dimensional spatial orientation angle), or both; this is not limited here. In addition, the training process may be performed locally by the electronic device, locally by the device containing the image acquisition device, or by a cloud server whose result is then transmitted to the electronic device or to the device containing the image acquisition device; this is not limited here.
The example from step S202 above, in which the user holds or wears a device equipped with the IMU data acquisition device and performs air-mouse operations within the shooting area of the image acquisition device during a one-second interval, is continued here. In this example, the second time set at which the image acquisition device collects the first image data in step S202 is (200, 400, 600, 800, 1000), 5 moments in total. In step S203 the electronic device performs CV key point recognition on the image data corresponding to each of these five moments (for example, the user wears a watch containing the IMU data acquisition device on the right wrist) and obtains the localization coordinates of the "right wrist" CV key point at the 5 moments. These coordinates indicate the position of the right wrist for the user's movement or operation at each moment; the movement direction of the right wrist is determined from the chronological order of the 5 moments, the three-dimensional spatial azimuth of the right wrist (or of the arm on which it lies) is determined from that direction, and this three-dimensional spatial orientation angle is determined as the first constraint condition.
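A minimal sketch of this derivation follows; the wrist coordinates are placeholder values, and the azimuth/elevation decomposition is one plausible way to express a three-dimensional orientation angle, not the only form the method permits.

```python
import math

# Hypothetical 'right_wrist' localization coordinates at the 5 CV moments,
# in the camera coordinate frame (placeholder values).
wrist_track = {
    200: (0.10, 1.20, 2.00),
    400: (0.14, 1.22, 2.00),
    600: (0.18, 1.25, 1.99),
    800: (0.22, 1.27, 1.99),
    1000: (0.26, 1.30, 1.98),
}

def orientation_between(p0, p1):
    """Azimuth and elevation (degrees) of the displacement from p0 to p1."""
    dx, dy, dz = (b - a for a, b in zip(p0, p1))
    azimuth = math.degrees(math.atan2(dx, dz))
    elevation = math.degrees(math.atan2(dy, math.hypot(dx, dz)))
    return azimuth, elevation

# Orientation angles between consecutive moments, in chronological order;
# these angles form the first constraint condition at the 5 CV moments.
times = sorted(wrist_track)
first_constraint = {
    t1: orientation_between(wrist_track[t0], wrist_track[t1])
    for t0, t1 in zip(times, times[1:])
}
```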
In addition, in step S203, the human-skeleton recognition technology used by the electronic device may be three-dimensional (3D) human-skeleton recognition for CV key point recognition (for example, when the image acquisition device in step S202 collects images with a monocular camera), or two-dimensional (2D) human-skeleton recognition (for example, when the image acquisition device in step S202 collects images with multiple cameras); this is not limited here. That is, the CV key points used by the electronic device in step S203 to determine the first constraint condition may be 2D human-body CV key points or 3D human-body CV key points, which is not limited here.
Optionally, the electronic device may obtain wearing-position information for the device containing the IMU data acquisition device, so that it can determine which of the multiple CV key points serves as the first constraint condition. For example, suppose the device containing the IMU data acquisition device is a watch and the electronic device is also a watch. The user wears the watch on the right wrist, and the watch can sense whether it is worn on the left or right wrist. Assuming the CV key points include six points (left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist), the watch can determine from the wearing-position information that the three-dimensional spatial azimuth information of the right-wrist CV key point serves as the first constraint condition.
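A minimal sketch of this selection, assuming the wearing position is reported as a simple "left"/"right" flag (an illustrative assumption; the actual sensing mechanism is not specified here):

```python
def select_constraint_keypoint(keypoints, wearing_side):
    """Pick the wrist key point matching the reported wearing side."""
    # keypoints: a dict like recognition_result in the sketch above;
    # wearing_side: "left" or "right", as sensed by the worn device.
    name = "left_wrist" if wearing_side == "left" else "right_wrist"
    return name, keypoints[name]
```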
S204. The electronic device calibrates the IMU data based on the first constraint condition to obtain target IMU data.
In this embodiment, the electronic device calibrates the IMU data determined in step S201 according to the first constraint condition determined in step S203 to obtain the target IMU data.
Specifically, the electronic device processes the IMU data recorded by the three sensors (gyroscope, accelerometer, and magnetometer) in the initial IMU data determined in step S201 with an attitude-solution algorithm to obtain attitude angle information, and calibrates the recorded IMU data based on the first constraint condition obtained in step S203 (or calibrates the attitude angle information based on that first constraint condition, or calibrates both the recorded IMU data and the attitude angle information) to obtain the target IMU data. The attitude-solution algorithm may include the Mahony algorithm, a Kalman filter algorithm, and the like.
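The Mahony and Kalman filter algorithms named above are complete attitude-solution algorithms; as a minimal stand-in that shows the shape of such a step, the sketch below uses a basic complementary filter on roll and pitch (yaw, which would additionally use the magnetometer, is omitted, and the gyro integration ignores Euler-rate coupling; all of this is an illustrative assumption, not the method's prescribed algorithm):

```python
import math

def attitude_step(roll, pitch, gyro, accel, dt, alpha=0.98):
    """One attitude-solution step: blend gyroscope integration with the
    gravity direction measured by the accelerometer.

    roll, pitch: previous attitude angles (radians)
    gyro: (gx, gy, gz) angular rates (rad/s)
    accel: (ax, ay, az) accelerometer reading
    dt: sampling period in seconds (0.01 s in the running example)
    """
    # Propagate the previous attitude with the gyroscope rates.
    roll_g = roll + gyro[0] * dt
    pitch_g = pitch + gyro[1] * dt

    # Attitude implied by the accelerometer's view of gravity.
    ax, ay, az = accel
    roll_a = math.atan2(ay, az)
    pitch_a = math.atan2(-ax, math.hypot(ay, az))

    # Complementary blend: trust the gyro short-term, gravity long-term.
    roll = alpha * roll_g + (1 - alpha) * roll_a
    pitch = alpha * pitch_g + (1 - alpha) * pitch_a
    return roll, pitch
```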
The example from step S202 above is continued here. In this example, the first time set of the initial IMU data collected by the IMU data acquisition device in step S201 is (10, 20, 30, ..., 200, 210, 220, ..., 400, ..., 600, ..., 800, ..., 980, 990, 1000), 100 moments in total, and the second time set of the first image data collected by the image acquisition device in step S202 is (200, 400, 600, 800, 1000), 5 moments in total. In step S203 the electronic device performs CV key point recognition on the image data corresponding to each of these five moments (for example, the user wears a watch containing the IMU data acquisition device on the right wrist), obtaining a first constraint condition containing the localization coordinates at the 5 moments, and then performs asynchronous calibration within the first time set according to those coordinates. That is, the attitude angle information corresponding to the IMU data at the five moments (200, 400, 600, 800, 1000) of the first time set is calibrated according to the localization coordinates at the 5 moments, yielding calibrated IMU data at all 100 moments, i.e., the target IMU data.
Specifically, in the above example, the first constraint condition is the three-dimensional spatial orientation angle of the arm indicated by the localization coordinates at the 5 moments, and the initial IMU data is the IMU data at 100 moments. That is, in step S204 the electronic device may use the arm orientation angles at the 5 moments (in the first constraint condition) to calibrate the IMU data at the 100 moments (in the initial IMU data), obtaining the target IMU data. Asynchronously calibrating the initial IMU data at many moments with a first constraint condition formed from CV recognition of image data at a few moments means that, unlike the synchronous calibration approach (for example, approach three above), there is no need to wait out the long CV recognition processing time: the control information derived from the calibrated target IMU data can subsequently drive the cursor in the display device, so that the cursor refresh rate can equal the frame rate of IMU data collection rather than being limited, as in the synchronous calibration approach, by the processing frequency of CV recognition. This raises the cursor refresh rate while avoiding problems such as cursor stuttering and display delay, improving user experience.
As one implementation example of the calibration process, in step S204 the electronic device may calibrate the IMU data recorded by the sensors in the initial IMU data according to the first constraint condition to obtain a calibration result, then process the calibration result with the attitude-solution algorithm and use the resulting attitude angle information as the target IMU data.
Specifically, the electronic device performs a mapping based on the arm three-dimensional orientation angles (at the 5 moments) in the first constraint condition to obtain the IMU calibration data (at the 5 moments) corresponding to each arm orientation angle, and fits the obtained IMU calibration data to produce an IMU calibration curve. It also fits the IMU data recorded by the sensors (at the 100 moments) in the initial IMU data to produce an initial IMU curve. It then takes a weighted average of the IMU calibration curve and the initial IMU curve to obtain an optimized curve, from which the corresponding calibrated IMU data (at the 100 moments) are read. The calibrated IMU data are further processed by the attitude-solution algorithm to obtain attitude angle information (at the 100 moments), which is used as the target IMU data obtained in step S204.
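A minimal sketch of this curve-fusion step, assuming a single scalar channel sampled on the example timelines, with linear interpolation and a fixed weight standing in for the unspecified fitting and weighting choices (the sine data are placeholders, not real sensor readings):

```python
import numpy as np

# Timelines from the running example (milliseconds).
imu_times = np.arange(10, 1001, 10)              # the 100 IMU moments
cv_times = np.array([200, 400, 600, 800, 1000])  # the 5 CV moments

# One scalar channel of the sensor-recorded IMU data, and the IMU
# calibration data mapped from the 5 arm orientation angles (placeholders).
initial_curve = np.sin(imu_times / 300.0)
imu_calibration = np.sin(cv_times / 300.0) + 0.05

# "Fit" the calibration data: linear interpolation onto the dense IMU
# timeline is the simplest stand-in (before 200 ms it holds the first value).
calibration_curve = np.interp(imu_times, cv_times, imu_calibration)

# Weighted average of the two curves gives the optimized curve, from which
# the calibrated IMU data at all 100 moments are read; the weight is an
# arbitrary example value, not one prescribed by the method.
w = 0.5
optimized_curve = w * calibration_curve + (1 - w) * initial_curve
```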
As another implementation example of the calibration process, in step S204 the electronic device may first process the IMU data recorded by the sensors in the initial IMU data with the attitude-solution algorithm to obtain attitude angle information, then calibrate that attitude angle information according to the first constraint condition and use the calibrated attitude angle information as the target IMU data.
Specifically, the electronic device performs a regression based on the arm three-dimensional orientation angles (at the 5 moments) in the first constraint condition to obtain the attitude angle information (at the 5 moments) corresponding to each arm orientation angle, and fits the obtained attitude angle information to produce an attitude angle calibration curve. It also processes the IMU data recorded by the sensors (at the 100 moments) in the initial IMU data with the attitude-solution algorithm to obtain attitude angle information (at the 100 moments), and fits that attitude angle information to produce an attitude angle change curve. It then applies weighted averaging/filtering to the attitude angle calibration curve and the attitude angle change curve to obtain an optimized attitude angle curve, from which the attitude angle information (at the 100 moments) is obtained as the target IMU data of step S204.
As discussed for the scenario of approach three above, when the user stands in front of the display screen and IMU and CV data from multiple devices are fused to implement real-time human-computer interaction, problems such as inaccurate positioning and cursor overflow readily arise. The implementation process shown in FIG. 9 is used here as an example to illustrate these problems in approach three.
As shown in FIG. 9, the user's hand carries a wearable device (including an IMU) and performs limb movements in front of a device equipped with a display screen and a camera. The wearable device acquires the IMU data generated by those movements while the camera acquires the corresponding image data, and the IMU data is calibrated according to that image data to implement air-mouse movement of a coordinate position on the display screen. Dashed arrows indicate the movement direction of the wearable device triggered by the user's limb movements, and solid arrows indicate the coordinate displacement of the mapped position on the display screen. Because the camera position on the display screen is fixed (for example, above the middle axis of the display screen), when the user is at position 0 and a limb movement sweeps a certain angular range, the mapped coordinates on the screen move from A to B. When the user is at position 1 and a limb movement sweeps the same angular range, the mapped coordinates move from C to D; because the relative direction between camera and user has changed, the resulting coordinate displacement differs even though the angular range is the same (i.e., the distance AB is not equal to the distance CD). Similarly, when the user is at position 2 and a limb movement sweeps the same angular range, the mapped coordinates move from E to F; here the camera-user relative direction at position 2 is similar to that at position 0, so the resulting displacement may be the same (i.e., the distance CD is approximately equal to the distance EF), but because position 2 is close to the right edge of the display screen, the cursor overflow shown in the figure readily occurs (i.e., the coordinates of point F exceed the coordinate range covered by the display area).
In addition, when the user's orientation changes, the same angular range of limb movement produces a different cursor movement path on the display screen. For example, when the user stands with the front of the body facing the camera versus with the side of the body facing the camera, the same limb action causes different cursor paths. Specifically, when the user faces the camera and the arm, pivoting at the shoulder joint, moves from hanging naturally at the side of the body to a position parallel to the ground in the same plane as the torso, the resulting cursor path on the screen may be an arc; when the user stands side-on to the camera and the arm performs the same action, the resulting cursor path may be a straight line. That is, when the user's position and orientation relative to the display screen change, the spatial displacement and the angular change of the torso coordinate system cannot be tracked, causing inaccurate positioning and cursor overflow. To solve this problem, step S204 can be further optimized, as described in detail below.
In one possible implementation, in step S204, the process in which the electronic device calibrates the initial IMU data based on the first constraint condition to obtain the target IMU data may specifically include: the electronic device first determines a first human arm ergonomic model according to the first image data obtained in step S202, the model including a first value range for at least one limb rotation direction; it then updates the first constraint condition based on the first human arm ergonomic model to obtain an updated first constraint condition; and it further calibrates the initial IMU data using the updated first constraint condition to obtain the target IMU data.
Specifically, the electronic device may determine the first human arm ergonomic model according to the first image data obtained in step S202, the model including a first value range for at least one limb rotation direction; that is, a minimum-work, minimum-torque-change human arm ergonomic model is constructed around the user's individual fatigue characteristics. For example, during human-computer interaction a user generally does not make limb movements that violate ergonomics: the flexion-extension range of the user's wrist joint may be [-35°, 50°] and the ulnar-deviation range may be [-25°, 30°]. If wrist movement beyond these ranges is detected, the recognition by the IMU data acquisition device or the image acquisition device can be regarded as erroneous. The human arm ergonomic model can therefore be constructed around the movable value ranges of the user's different limbs.
It should be noted that the user's fatigue characteristic refers to the "gorilla arm" effect: for example, the user's elbow joint shows larger angular changes than the shoulder joint; the shoulder may briefly move through a large range in the first few minutes, but its movement angle soon shrinks to within 30 degrees. Minimum work and minimum torque change refer to the same phenomenon: moving through a given angle at the distal end of a rigid limb segment costs less work and torque than moving through the same angle at its root.
Thereafter, the target IMU data are determined based on the first human arm ergonomic model and the first constraint condition. In a specific implementation, the first constraint condition determined in step S203 (for example, the aforementioned three-dimensional spatial orientation angle) may be input into the human arm ergonomic model to form a ternary inequality on the attitude angle, and the initial IMU data are calibrated based on that ternary inequality to obtain the target IMU data of step S204. Using the human arm ergonomic model as one of the constraint conditions can effectively avoid inaccurate positioning on the display screen, cursor overflow, and similar issues.
As an example, one implementation of the human arm ergonomic model is shown in FIG. 10. Based on mechanics and dynamics analysis, the user's limb is modeled as a four-axis rigid-cylinder human arm model; in the figure, θ1 indicates the user's shoulder, θ2 the shoulder joint, θ3 the elbow joint, and θ4 the wrist joint. Specifically, each rigid cylinder corresponding to a limb segment has different movable distances/angles in different degrees of freedom, and the first human arm ergonomic model includes a first value range for at least one limb rotation direction. The degrees of freedom of the user's different limbs can be represented as pitch, roll, and yaw; in general, the user's limb movements can be expressed in terms of these three degrees of freedom, as the following examples and the data-structure sketch after them illustrate.
For example, the adduction or abduction of the user's shoulder joint can be represented by pitch, the forward flexion or backward extension of the shoulder joint by roll, and the internal or external rotation of the shoulder joint by yaw.
As another example, the flexion and extension of the user's elbow joint can be represented by pitch, and the rotation of the elbow joint by roll.
As another example, the flexion and extension of the user's wrist joint can be represented by pitch, and the ulnar deviation of the wrist joint by roll.
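A minimal sketch of such a model as a data structure follows; the wrist ranges are the ones quoted in this section, while the shoulder and elbow ranges are placeholders for illustration.

```python
# Degree-of-freedom value ranges (degrees) per joint, following the
# four-axis model of FIG. 10 and the pitch/roll/yaw conventions above.
ARM_MODEL = {
    "shoulder": {"pitch": (-45.0, 130.0),  # abduction/adduction (placeholder)
                 "roll": (-40.0, 170.0),   # flexion/extension (placeholder)
                 "yaw": (-70.0, 90.0)},    # internal/external rotation (placeholder)
    "elbow":    {"pitch": (0.0, 145.0),    # flexion/extension (placeholder)
                 "roll": (-80.0, 80.0)},   # rotation (placeholder)
    "wrist":    {"pitch": (-35.0, 50.0),   # flexion/extension (from the text)
                 "roll": (-25.0, 30.0)},   # ulnar deviation (from the text)
}
```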
As an example, suppose the device containing the IMU data acquisition device is a watch, which is generally worn on the user's wrist. From the constructed first human arm ergonomic model, the first value ranges of the user's wrist movements are flexion-extension [-35°, 50°] and ulnar deviation [-25°, 30°]. If the first constraint condition obtained from the image acquisition device indicates that the user's wrist attitude angle is a flexion-extension angle, that angle can be input into the first human arm ergonomic model to form a ternary inequality on the flexion-extension angle, expressed as:
min ≤ pitch ≤ max;
where min indicates the minimum flexion-extension value in the first value range, i.e., -35°; pitch is the value of the flexion-extension angle on the pitch degree of freedom; and max indicates the maximum flexion-extension value in the first value range, i.e., 50°.
In this ternary inequality, if the first constraint condition indicates that the user's wrist flexion-extension angle exceeds the first value range, the first constraint condition can be updated based on that range: the out-of-range value is updated to the minimum or maximum of the first value range (i.e., -35° or 50°). Obviously, if the degree of freedom of the detected user limb is not pitch, another degree of freedom can replace pitch in the ternary inequality; this can be flexibly adapted to the specific application scenario and is not limited here.
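A minimal sketch of this update step, reusing the ARM_MODEL sketch above (the joint and degree-of-freedom lookup is an illustrative assumption):

```python
def update_constraint(joint, dof, angle, model=ARM_MODEL):
    """Clamp an attitude angle into its ergonomic value range, enforcing
    the ternary inequality min <= angle <= max described above."""
    lo, hi = model[joint][dof]
    # An out-of-range value is treated as a recognition error and replaced
    # by the nearest bound of the first value range.
    return min(max(angle, lo), hi)

# Example: a reported wrist flexion-extension angle of 62 degrees exceeds
# the [-35, 50] range and is updated to 50 degrees.
assert update_constraint("wrist", "pitch", 62.0) == 50.0
```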
In one possible implementation, if the electronic device performs the initialization process shown in FIG. 3, it can perform calculations on the initialization image data obtained in step S101 of FIG. 3 to establish an initial human arm ergonomic model, and update that initial model to obtain the above first human arm ergonomic model only when the relative information between the user and the image acquisition device changes.
Optionally, the first duration of the user initialization process performed by the electronic device in FIG. 3 may include a third time set. In the second time set, which follows the third time set, the electronic device performs CV key point recognition on the first image data and can obtain relevant parameters (for example, shoulder width, inter-eye distance, inter-ear distance) to determine first relative information; when the first relative information differs from the initial relative information, the electronic device updates the initial human arm ergonomic model according to the first relative information to obtain the first human arm ergonomic model. Specifically, based on the difference between the initial relative information and the first relative information (for example, a difference in orientation angle or in distance), the overall coordinate values of the initial human arm ergonomic model can be offset-corrected in the direction indicated by that difference to obtain the first human arm ergonomic model.
It should be noted that the third time set precedes the first time period.
The electronic device may also determine the initial image data collected by the image acquisition device in the third time set, which precedes the second time set, and construct the initial human arm ergonomic model according to that initial image data. When the user walks, turns, or otherwise causes the initial relative information to differ from the first relative information, i.e., when the relative information between the user (for example, the user's torso or body) and the image acquisition device at the second time set has changed compared with that at the third time set, the initial human arm ergonomic model is updated using the first relative information to obtain the first human arm ergonomic model, further optimizing cursor control.
It should be noted that establishing the initial human arm ergonomic model by computing on the initialization image data during user initialization is similar to the aforementioned process of determining the first human arm ergonomic model from the first image data obtained in step S202, and is not repeated here.
The scenario in which the initial human arm ergonomic model is updated to obtain the above first human arm ergonomic model only when the relative information between the user and the image acquisition device changes is described here by way of example.
The electronic device may determine the first relative information between the user and the image acquisition device according to the first image data, and trigger the above update process to obtain the first human arm ergonomic model when the first relative information differs from the initial relative information. For example, the shoulder joint includes at least two degrees of freedom, flexion-extension and abduction-adduction. Take as an example initial relative information indicating that the user directly faces the image acquisition device; the initial relative information can then be used to establish the initial human arm ergonomic model, whose first value range includes parameters on the two shoulder degrees of freedom: the movement range on the abduction-adduction degree of freedom may be [0°, 90°], and on flexion-extension [0°, 0°]. Take as a further example first relative information indicating that the user is turned 90 degrees sideways to the image acquisition device; the initial human arm ergonomic model can then be updated according to the first relative information, i.e., the coordinates of the initial model are translated/rotated according to the difference between the initial and first relative information, yielding the updated first human arm ergonomic model. In this example, the second value range of the first human arm ergonomic model also includes parameters on the two shoulder degrees of freedom: the movement range on the abduction-adduction degree of freedom may be [0°, 0°], and on flexion-extension [0°, 90°].
Similarly, for the process in which the electronic device determines the first relative information according to the first image data, reference may be made to the process of determining the initial relative information, which is not repeated here.
Further, when the initial relative information is the same as the first relative information, the electronic device determines the initial human arm ergonomic model as the first human arm ergonomic model. That is, when the relative information between the user and the image acquisition device at the second time set has not changed compared with that at the third time set, the initial human arm ergonomic model can be determined as the first human arm ergonomic model without performing a model update, improving processing efficiency.
S205. The electronic device performs coordinate conversion processing on the target IMU data to obtain control information.
In this embodiment, the electronic device performs coordinate conversion processing according to the target IMU data obtained in step S204 to obtain control information, where the control information is used to control the cursor in the display device, for example, to make the cursor perform related operations such as moving, dragging, zooming in, and clicking.
As an example, when the control information is used to move the cursor in the display device, it may be the specific coordinate information (X, Y) of the cursor in the two-dimensional display plane of the display device. When the control information is used to make the cursor perform dragging, zooming, or clicking, it may be an identifier corresponding to the gesture action (for example, identifier 1 for dragging, identifier 2 for zooming, identifier 3 for clicking). In step S205, the electronic device may use a preset neural-network classifier to determine whether the target IMU data correspond to a gesture action of a given category and, if so, use the identifier of that gesture action as the control information to perform the related operation on the cursor in the display device.
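A minimal sketch of this branching between movement coordinates and gesture identifiers (the classifier and the coordinate-conversion function are stand-ins for components this section leaves unspecified; the identifiers follow the example above):

```python
# Hypothetical gesture identifiers from the example above.
GESTURE_IDS = {"drag": 1, "zoom": 2, "click": 3}

def control_info_from(target_imu, classify, angles_to_xy):
    """Produce control information from target IMU data.

    classify: preset neural-network classifier returning a gesture name
    or None; angles_to_xy: coordinate conversion for cursor movement
    (see the sketch further below). Both are illustrative stand-ins.
    """
    gesture = classify(target_imu)
    if gesture in GESTURE_IDS:
        return ("gesture", GESTURE_IDS[gesture])
    return ("move", angles_to_xy(target_imu))
```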
In one possible implementation, in step S205, the process in which the electronic device performs coordinate conversion on the target IMU data to obtain the control information may include: the electronic device determines the first relative information between the user (for example, the user's torso or body) and the image acquisition device according to the first image data, determines a first mapping relationship of the user (for example, the user's torso or body) in the display device according to that first relative information, and then performs coordinate conversion processing on the target IMU data according to the first mapping relationship to obtain the control information.
Optionally, the first relative information includes parameters such as distance and standing orientation; for the implementation process, refer to the content of step S204, which is not repeated here.
Specifically, the user may move during cursor control, so the relative information between the user and the image acquisition device may change. Therefore, the first mapping relationship of the user (for example, the user's torso or body) in the display device can be further determined from the first relative information determined from the first image data, and used as the basis for producing control information, avoiding problems such as inaccurate positioning and cursor overflow that arise when that relative information changes.
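A minimal sketch of one plausible coordinate conversion, assuming the first mapping relationship reduces to a per-axis gain and origin (the actual mapping derived from the user's distance and orientation is not specified in this section), with the result clamped to the display area so the cursor cannot overflow:

```python
def angles_to_xy(yaw_deg, pitch_deg, mapping, screen_w=1920, screen_h=1080):
    """Map calibrated attitude angles to clamped screen coordinates.

    mapping: dict with per-axis gain (pixels per degree) and origin,
    standing in for the first mapping relationship derived from the
    user's position relative to the image acquisition device.
    """
    x = mapping["origin_x"] + mapping["gain_x"] * yaw_deg
    y = mapping["origin_y"] - mapping["gain_y"] * pitch_deg
    # Clamp into the display area to avoid the cursor overflow of FIG. 9.
    x = min(max(x, 0), screen_w - 1)
    y = min(max(y, 0), screen_h - 1)
    return int(x), int(y)

# Example mapping: cursor centered on a 1920x1080 screen, 20 px/degree.
mapping = {"origin_x": 960, "origin_y": 540, "gain_x": 20.0, "gain_y": 20.0}
```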
In one possible implementation, if the electronic device performs the initialization process shown in FIG. 3, it can perform calculations on the initialization image data obtained in step S101 of FIG. 3 to obtain an initial mapping relationship of the user (for example, the user's torso or body) in the display device, and update that initial mapping relationship to obtain the above first mapping relationship only when the relative information between the user and the image acquisition device changes.
As an example, the electronic device may also determine the initial mapping relationship of the user in the display device according to the initial image data. Thereafter, when the initial relative information differs from the first relative information, i.e., when the relative information between the user and the image acquisition device at the second time set has changed compared with that at the third time set, the initial mapping relationship is updated using the first relative information to obtain the first mapping relationship, further optimizing cursor control.
Further, when the initial relative information is the same as the first relative information, the electronic device determines the initial mapping relationship as the first mapping relationship. That is, when the relative information between the user (for example, the user's torso or body) and the image acquisition device at the second time set has not changed compared with that at the third time set, the initial mapping relationship can be determined as the first mapping relationship without performing a mapping-relationship update, improving processing efficiency.
From the above, the embodiments of the present application have at least the following beneficial effects. The electronic device performs CV key point recognition on the first image data obtained by the image acquisition device collecting the user's limb movements within the second time set to obtain the first constraint condition; based on that first constraint condition, it calibrates the initial IMU data obtained by the IMU data acquisition device collecting the user's limb movements within the first time set to obtain the target IMU data; and it then performs coordinate conversion processing based on the target IMU data to obtain the control information used to control the cursor in the display device. Here the second time set is a subset of the first time set, i.e., the process in which the electronic device calibrates the initial IMU data to obtain the target IMU data is asynchronous calibration. Because hardware computing limitations make the computation time of the CV recognition process generally far longer than the processing time of IMU data, this asynchronous calibration, compared with the real-time synchronous-calibration style of human-computer interaction, does not have to wait for the long CV processing, effectively avoiding problems such as display stuttering and display delay. Controlling the cursor in the display device through asynchronous calibration can therefore improve the continuity of cursor movement, improving user experience.
Referring to FIG. 11, an embodiment of the present application further provides a human-computer interaction method, which may be performed by one or more electronic devices and which may include the following module arrangement.
An image data determination module 1101, configured to determine and output image data, the image data including at least the first image data, corresponding to the implementation process in the aforementioned step S202;
A CV key point recognition module 1102, configured to perform CV key point recognition on the image data output by the image data determination module 1101, and obtain and output the first constraint condition, corresponding to the implementation process of the aforementioned step S203;
An IMU data determination module 1103, configured to determine and output IMU data, the IMU data including at least the initial IMU data, corresponding to the implementation process in the aforementioned step S201;
An asynchronous calibration module 1104, configured to perform calibration processing at least according to the first constraint condition and the initial IMU data, and obtain and output the target IMU data, corresponding to the implementation process in the aforementioned step S204;
A coordinate conversion module 1105, configured to perform coordinate conversion processing at least according to the target IMU data, and obtain and output control information, corresponding to the implementation process in the aforementioned step S205.
Optionally, the electronic device shown in FIG. 11 may further include the following modules.
A display device 1106, configured to control the cursor according to the control information;
A preprocessing module 1107, configured to preprocess the initial IMU data and output the preprocessed result to the asynchronous calibration module 1104, where the preprocessing may include waveform smoothing, denoising, and calibration-compensation processing;
A human arm ergonomic model building module 1108, configured to construct the first human arm ergonomic model according to the first image data output by the image data determination module 1101, and use the first human arm ergonomic model to update the first constraint condition input to the asynchronous calibration module 1104, as one basis for determining the target IMU data;
Optionally, the human arm ergonomic model building module 1108 may also be configured to construct the initial human arm ergonomic model according to the initial image data output by the image data determination module 1101, and use the initial human arm ergonomic model to update the first constraint condition input to the asynchronous calibration module 1104;
A mapping relationship determination module 1109, configured to determine the first mapping relationship according to the first image data output by the image data determination module 1101 and output it to the coordinate conversion module 1105, as one basis for the coordinate conversion processing;
Optionally, the mapping relationship determination module 1109 may also be configured to determine the initial mapping relationship according to the initial image data output by the image data determination module 1101 and output it to the coordinate conversion module 1105, as one basis for the coordinate conversion processing;
A relative information change judgment module 1110, configured to judge whether the first relative information indicated by the first image data has changed relative to the initial relative information indicated by the initial image data;
If it has changed, the relative information change judgment module 1110 outputs the judgment result to the human arm ergonomic model building module 1108, so that module 1108 determines to output the first human arm ergonomic model to the asynchronous calibration module 1104; and it outputs the judgment result to the mapping relationship determination module 1109, so that module 1109 determines to output the first mapping relationship to the coordinate conversion module 1105.
If it has not changed, the relative information change judgment module 1110 outputs the judgment result to the human arm ergonomic model building module 1108, so that module 1108 determines to output the initial human arm ergonomic model to the asynchronous calibration module 1104; and it outputs the judgment result to the mapping relationship determination module 1109, so that module 1109 determines to output the initial mapping relationship to the coordinate conversion module 1105.
In addition, for the implementation process of each module shown in FIG. 11 and the corresponding beneficial effects, reference may be made to the description of the foregoing method embodiments, which is not repeated here.
Referring to FIG. 12, an embodiment of the present application further provides a first electronic device 1200, where the first electronic device 1200 may include at least a motion sensor 1201 and a processor 1202.
Optionally, the first electronic device 1200 may further include other components, such as a memory, a housing, and a communication module, which are not limited here.
Specifically, the motion sensor 1201 may be used to implement the implementation process of the IMU data acquisition device in any of the foregoing embodiments, and the processor 1202 may be used to perform the computation, processing, and other implementation processes in any of the foregoing embodiments and achieve the corresponding beneficial effects, which are not described one by one here.
Referring to FIG. 13, an embodiment of the present application further provides a second electronic device 1300, where the second electronic device 1300 may include at least a camera 1301 and a display screen 1302.
Optionally, the second electronic device 1300 may further include other components, such as a memory, a housing, and a communication module, which are not limited here.
具体地,该摄像头1301可以用于实现前述任意实施例中图像采集装置的实现过程,显示屏1302可以用于执行前述任意实施例中的显示装置的实现过程,并实现对应的有益效果, 此处不一一赘述。Specifically, the camera 1301 can be used to implement the implementation process of the image acquisition device in any of the foregoing embodiments, and the display screen 1302 can be used to implement the implementation process of the display device in any of the foregoing embodiments, and achieve corresponding beneficial effects, here Not to repeat them one by one.
The present application provides an electronic device, which is coupled to a memory and configured to read and execute the instructions stored in the memory, so that the electronic device implements the steps of the method performed by the electronic device in any of the embodiments of FIG. 3 to FIG. 11. In a possible design, the electronic device is a chip or a system on a chip.
The present application provides a chip system, where the chip system includes a processor configured to support an electronic device in implementing the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods. In a possible design, the chip system further includes a memory configured to store the necessary program instructions and data. The chip system may consist of a chip, or may include a chip and other discrete devices.
The present application further provides a processor, coupled to a memory and configured to execute the methods and functions relating to the electronic device in any of the foregoing embodiments.
The present application further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a computer, the method procedure relating to the electronic device in any of the foregoing method embodiments is implemented. Correspondingly, the computer may be the above electronic device.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. For example, the division into units is merely a division by logical function; in actual implementation there may be other divisions. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The terms "first", "second", and the like in the specification, claims, and drawings of the present application are used to distinguish between similar objects, and are not necessarily used to describe a particular order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; this is merely the manner of distinguishing objects with the same attributes when they are described in the embodiments of the present application. Furthermore, the terms "include" and "have", and any variations thereof, are intended to cover a non-exclusive inclusion, so that a process, method, system, product, or device that includes a series of units is not necessarily limited to those units, but may include other units that are not expressly listed or that are inherent to the process, method, product, or device.
The names of the messages/frames/information, modules, units, and the like provided in the embodiments of the present application are merely examples; other names may be used, provided that the functions of the messages/frames/information, modules, or units are the same.
The terms used in the embodiments of the present application are merely for the purpose of describing specific embodiments, and are not intended to limit the present invention. The singular forms "a", "said", and "the" used in the embodiments of the present application are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that, in the description of the present application, unless otherwise specified, "/" indicates an "or" relationship between the associated objects; for example, A/B may represent A or B. In the present application, "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate the following three cases: only A exists, both A and B exist, and only B exists, where A and B may be singular or plural.
Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
The foregoing embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements for some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (33)

  1. A human-computer interaction method, applied to a human-computer interaction system comprising a motion sensor, a camera, a processor, and a display screen, wherein the method comprises:
    the motion sensor acquires initial motion sensing data at a first sampling frequency within a first time period, the initial motion sensing data being triggered by a user's limb movements;
    the camera acquires first image data at a second sampling frequency within the first time period, the second sampling frequency being lower than the first sampling frequency, and the first image data including the user's limb movement information;
    the processor obtains a first constraint condition, the first constraint condition being obtained by performing computer vision (CV) processing on the first image data;
    the processor calibrates the initial motion sensing data according to the first constraint condition to obtain target motion sensing data; and
    the processor obtains control information according to the target motion sensing data, the control information being used to control the display screen.
  2. The method according to claim 1, wherein the first constraint condition is obtained by human skeleton key point recognition in the computer vision (CV) processing performed on the first image data, and the first constraint condition includes three-dimensional spatial orientation angle information.
  3. The method according to claim 1 or 2, wherein the processor calibrating the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data specifically comprises:
    the processor obtains calibration data by mapping according to the first constraint condition, and obtains a first curve by fitting based on the calibration data;
    the processor obtains a second curve by fitting according to the initial motion sensing data, and performs weighted averaging on the first curve and the second curve to obtain a third curve;
    the processor determines calibrated motion sensing data in the third curve; and
    the processor processes the calibrated motion sensing data according to an attitude calculation algorithm to obtain the target motion sensing data.
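For illustration, the following is a minimal sketch of the calibration flow of claim 3, assuming polynomial curve fitting and a fixed blend weight; neither choice is mandated by the claim, and all identifiers are hypothetical.

```python
import numpy as np

def calibrate(t_cv, cv_calib, t_imu, imu_raw, w=0.4, deg=3):
    """t_cv/t_imu: sample timestamps; cv_calib: calibration data mapped from
    the first constraint condition; imu_raw: initial motion sensing data."""
    first = np.polynomial.Polynomial.fit(t_cv, cv_calib, deg)    # first curve
    second = np.polynomial.Polynomial.fit(t_imu, imu_raw, deg)   # second curve
    # Third curve: pointwise weighted average, evaluated on the IMU timeline.
    third = w * first(t_imu) + (1.0 - w) * second(t_imu)
    return third  # calibrated motion sensing data, ready for attitude solving
```

The returned samples would then be fed to an attitude calculation (sensor-fusion) algorithm to produce the target motion sensing data, as the final step of the claim recites.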
  4. The method according to any one of claims 1 to 3, wherein the processor calibrating the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data specifically comprises:
    the processor processes the initial motion sensing data according to an attitude calculation algorithm to obtain first attitude angle data;
    the processor obtains a fourth curve by fitting according to the first attitude angle data, and obtains a fifth curve by fitting according to the first constraint condition;
    the processor performs weighted averaging on the fourth curve and the fifth curve to obtain a sixth curve; and
    the processor determines the target motion sensing data in the sixth curve.
  5. The method according to any one of claims 1 to 4, wherein the control information is coordinate data obtained by performing coordinate transformation on the target motion sensing data, the coordinate data being used to control a display position of a cursor on the display screen; or
    the control information is a gesture identification result obtained by mapping the target motion sensing data, the gesture identification result being used to operate an interface element of the display screen.
  6. The method according to any one of claims 1 to 5, wherein before the processor calibrates the initial motion sensing data according to the first constraint condition, the method further comprises: the processor aligns the first constraint condition and the initial motion sensing data according to a time difference.
  7. The method according to claim 6, wherein the time difference is calculated through an initialization process before the first time period, and the method further comprises:
    the display screen displays first prompt information for prompting the user to make a specified limb movement;
    the motion sensor acquires motion sensing data in the initialization process, the motion sensing data in the initialization process being triggered by the specified limb movement made by the user;
    the camera acquires image data in the initialization process, the image data in the initialization process including information about the specified limb movement made by the user; and
    the processor determines the time difference according to signal features of the motion sensing data in the initialization process and signal features of the image data in the initialization process.
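For illustration, one conventional way to realize the last step of claim 7 is to cross-correlate the two signals' features from the initialization gesture; the use of scipy and linear-interpolation resampling here is an assumption, not the claimed implementation.

```python
import numpy as np
from scipy.signal import correlate

def estimate_time_offset(imu_feature, cam_feature, fs_imu, fs_cam):
    """Estimate the IMU-camera time difference (seconds) from two 1-D
    feature signals recorded during the initialization gesture."""
    # Bring the lower-rate camera feature onto the IMU time base.
    t_imu = np.arange(len(imu_feature)) / fs_imu
    t_cam = np.arange(len(cam_feature)) / fs_cam
    cam_up = np.interp(t_imu, t_cam, cam_feature)
    # Peak of the zero-mean cross-correlation gives the lag in IMU samples.
    imu = imu_feature - np.mean(imu_feature)
    cam = cam_up - np.mean(cam_up)
    lag = np.argmax(correlate(imu, cam, mode="full")) - (len(imu) - 1)
    return lag / fs_imu  # lag of the IMU feature relative to the camera feature
```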
  8. The method according to claim 7, wherein the method further comprises:
    the processor determines initial relative information between the user and the camera according to the image data in the initialization process.
  9. The method according to claim 8, wherein the processor calibrating the initial motion sensing data according to the first constraint condition to obtain the target motion sensing data comprises:
    the processor determines an initial human arm ergonomic model according to the initial relative information, the initial human arm ergonomic model including a first value range of at least one limb movement angle;
    the processor updates the first constraint condition according to the initial human arm ergonomic model to obtain an updated first constraint condition; and
    the processor calibrates the initial motion sensing data according to the updated first constraint condition to obtain the target motion sensing data.
  10. The method according to claim 9, wherein the processor updating the first constraint condition according to the initial human arm ergonomic model to obtain the updated first constraint condition comprises:
    the processor determines first relative information between the user and the camera according to the first image data;
    when the first relative information is different from the initial relative information, the processor updates the initial human arm ergonomic model according to the first relative information to obtain a first human arm ergonomic model; and
    the processor updates the first constraint condition according to the first human arm ergonomic model to obtain the updated first constraint condition.
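For illustration, the following is a minimal sketch of how an arm ergonomic model's angle ranges might update the first constraint condition, in the spirit of claims 9 and 10; the joint names and numeric ranges are placeholders, not values from this application.

```python
import numpy as np

ARM_MODEL = {  # first value range of each limb movement angle, in degrees
    "shoulder_pitch": (-30.0, 120.0),
    "elbow_flexion": (0.0, 145.0),
}

def update_constraint(constraint_angles):
    """constraint_angles: {joint: array of CV-derived orientation angles}.
    Angles outside the model's admissible interval are clipped to it,
    yielding the updated first constraint condition."""
    return {joint: np.clip(vals, *ARM_MODEL[joint])
            for joint, vals in constraint_angles.items()}
```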
  11. The method according to claim 8, wherein the processor obtaining the control information according to the target motion sensing data comprises:
    the processor determines an initial mapping relationship of the user in the display device according to the initial relative information; and
    the processor performs coordinate transformation on the target motion sensing data according to the initial mapping relationship to obtain the control information.
  12. The method according to claim 11, wherein the processor performing coordinate transformation on the target motion sensing data according to the initial mapping relationship to obtain the control information comprises:
    the processor determines first relative information between the user and the camera according to the first image data;
    when the first relative information is different from the initial relative information, the processor updates the initial mapping relationship according to the first relative information to obtain a first mapping relationship; and
    the processor performs coordinate transformation on the target motion sensing data according to the first mapping relationship to obtain the control information.
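For illustration, the following is a minimal sketch of the coordinate transformation of claims 11 and 12, assuming a linear pixels-per-degree mapping rebuilt whenever the relative information changes; the gain model and screen dimensions are assumptions.

```python
def to_screen(yaw_deg, pitch_deg, mapping, width=1920, height=1080):
    """mapping: dict with per-axis gains (pixels per degree), e.g. derived
    from the user-camera relative information (distance, bearing)."""
    x = width / 2 + mapping["gain_x"] * yaw_deg
    y = height / 2 - mapping["gain_y"] * pitch_deg
    # Keep the cursor inside the display.
    return (min(max(x, 0), width - 1), min(max(y, 0), height - 1))
```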
  13. The method according to any one of claims 1 to 12, wherein the motion sensor includes a sensing unit of one or more of an accelerometer, a gyroscope, and a magnetometer.
  14. The method according to any one of claims 1 to 13, wherein the camera includes one or more of a depth camera and a non-depth camera.
  15. A first electronic device, comprising a motion sensor and a processor, wherein:
    the motion sensor is configured to acquire initial motion sensing data at a first sampling frequency within a first time period, the initial motion sensing data being triggered by a user's limb movements;
    the processor is configured to calibrate the initial motion sensing data according to an obtained first constraint condition to obtain target motion sensing data, wherein the first constraint condition is obtained by performing computer vision (CV) processing on first image data acquired by a camera at a second sampling frequency within the first time period, the second sampling frequency is lower than the first sampling frequency, and the first image data includes the user's limb movement information; and
    the processor is further configured to obtain control information according to the target motion sensing data, the control information being used to control display content of a display screen;
    wherein the camera and the display screen are included in a second electronic device different from the first electronic device.
  16. The first electronic device according to claim 15, wherein the processor is specifically configured to:
    obtain calibration data by mapping according to the first constraint condition, and obtain a first curve by fitting based on the calibration data;
    obtain a second curve by fitting according to the initial motion sensing data, and perform weighted averaging on the first curve and the second curve to obtain a third curve;
    determine calibrated motion sensing data in the third curve; and
    process the calibrated motion sensing data according to an attitude calculation algorithm to obtain the target motion sensing data.
  17. The first electronic device according to claim 15 or 16, wherein the processor is specifically configured to:
    process the initial motion sensing data according to an attitude calculation algorithm to obtain first attitude angle data;
    obtain a fourth curve by fitting according to the first attitude angle data, and obtain a fifth curve by fitting according to the first constraint condition;
    perform weighted averaging on the fourth curve and the fifth curve to obtain a sixth curve; and
    determine the target motion sensing data in the sixth curve.
  18. The first electronic device according to any one of claims 15 to 17, wherein the processor is further configured to:
    align the first constraint condition and the initial motion sensing data according to a time difference.
  19. The first electronic device according to claim 18, wherein the time difference is calculated through an initialization process before the first time period;
    the motion sensor is further configured to acquire motion sensing data in the initialization process, the motion sensing data in the initialization process being triggered by a specified limb movement made by the user; and
    the processor is further configured to determine the time difference according to signal features of the motion sensing data in the initialization process and signal features of image data in the initialization process, wherein the image data in the initialization process is acquired by the camera during the initialization process and includes information about the specified limb movement made by the user.
  20. The first electronic device according to claim 19, wherein the processor is further configured to:
    determine initial relative information between the user and the camera according to the image data in the initialization process.
  21. The first electronic device according to claim 20, wherein the processor is specifically configured to:
    determine an initial human arm ergonomic model according to the initial relative information, the initial human arm ergonomic model including a first value range of at least one limb movement angle;
    update the first constraint condition according to the initial human arm ergonomic model to obtain an updated first constraint condition; and
    calibrate the initial motion sensing data according to the updated first constraint condition to obtain the target motion sensing data.
  22. The first electronic device according to claim 21, wherein the processor is specifically configured to:
    determine first relative information between the user and the camera according to the first image data;
    when the first relative information is different from the initial relative information, update the initial human arm ergonomic model according to the first relative information to obtain a first human arm ergonomic model; and
    update the first constraint condition according to the first human arm ergonomic model to obtain the updated first constraint condition.
  23. The first electronic device according to claim 19, wherein the processor is further configured to:
    determine an initial mapping relationship of the user in the display device according to the initial relative information; and
    perform coordinate transformation on the target motion sensing data according to the initial mapping relationship to obtain the control information.
  24. The first electronic device according to claim 23, wherein the processor is specifically configured to:
    determine first relative information between the user and the camera according to the first image data;
    when the first relative information is different from the initial relative information, update the initial mapping relationship according to the first relative information to obtain a first mapping relationship; and
    perform coordinate transformation on the target motion sensing data according to the first mapping relationship to obtain the control information.
  25. The first electronic device according to any one of claims 15 to 24, wherein the motion sensor includes a sensing unit of one or more of an accelerometer, a gyroscope, and a magnetometer.
  26. A second electronic device, comprising a camera and a display screen, wherein:
    the camera is configured to acquire first image data at a second sampling frequency within a first time period, the first image data including a user's limb movement information, wherein the first image data is used to determine a first constraint condition, the first constraint condition is used to calibrate initial motion sensing data to obtain target motion sensing data, the initial motion sensing data is sampled at a first sampling frequency within the first time period by a motion sensor in a first electronic device and is triggered by the user's limb movements, and the second sampling frequency is lower than the first sampling frequency; and
    the display screen is configured to display control information, wherein the control information is obtained based on the target motion sensing data.
  27. The second electronic device according to claim 26, wherein the first constraint condition is obtained by human skeleton key point recognition in computer vision (CV) processing performed on the first image data, and the first constraint condition includes three-dimensional spatial orientation angle information.
  28. The second electronic device according to claim 26 or 27, wherein
    the control information is coordinate data obtained by performing coordinate transformation on the target motion sensing data, the coordinate data being used to control a display position of a cursor on the display screen; or
    the control information is a gesture identification result obtained by mapping the target motion sensing data, the gesture identification result being used to operate an interface element of the display screen.
  29. The second electronic device according to any one of claims 26 to 28, wherein
    the display screen is further configured to display first prompt information for prompting the user to make a specified limb movement; and
    the camera is further configured to acquire, during an initialization process before the first time period, image data in the initialization process, the image data in the initialization process including information about the specified limb movement made by the user;
    wherein signal features of the image data in the initialization process and signal features of motion sensing data in the initialization process determine a time difference, the time difference is used to align the first constraint condition and the initial motion sensing data, and the motion sensing data in the initialization process is collected by the second electronic device during the initialization process.
  30. The second electronic device according to any one of claims 26 to 29, wherein the camera includes one or more of a depth camera and a non-depth camera.
  31. A computer-readable storage medium, wherein the medium stores instructions that, when executed by a computer, implement the method performed by the first electronic device according to any one of claims 15 to 25.
  32. A computer-readable storage medium, wherein the medium stores instructions that, when executed by a computer, implement the method performed by the second electronic device according to any one of claims 26 to 30.
  33. A computer program product, comprising instructions that, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 14.
PCT/CN2022/085282 2021-04-30 2022-04-06 Human-computer interaction method and device WO2022228056A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110486465.2 2021-04-30
CN202110486465.2A CN115268619A (en) 2021-04-30 2021-04-30 Man-machine interaction method and equipment

Publications (1)

Publication Number Publication Date
WO2022228056A1 true WO2022228056A1 (en) 2022-11-03

Family

ID=83744936

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/085282 WO2022228056A1 (en) 2021-04-30 2022-04-06 Human-computer interaction method and device

Country Status (2)

Country Link
CN (1) CN115268619A (en)
WO (1) WO2022228056A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105551059A (en) * 2015-12-08 2016-05-04 国网山西省电力公司技能培训中心 Power transformation simulation human body motion capturing method based on optical and inertial body feeling data fusion
CN106256394A (en) * 2016-07-14 2016-12-28 广东技术师范学院 The training devices of mixing motion capture and system
CN108106614A (en) * 2017-12-22 2018-06-01 北京轻威科技有限责任公司 A kind of inertial sensor melts algorithm with visual sensor data
CN109147058A (en) * 2018-08-31 2019-01-04 腾讯科技(深圳)有限公司 Initial method and device and storage medium for the fusion of vision inertial navigation information
CN109544638A (en) * 2018-10-29 2019-03-29 浙江工业大学 A kind of asynchronous online calibration method for Multi-sensor Fusion


Also Published As

Publication number Publication date
CN115268619A (en) 2022-11-01

Similar Documents

Publication Publication Date Title
US11861070B2 (en) Hand gestures for animating and controlling virtual and graphical elements
US11531402B1 (en) Bimanual gestures for controlling virtual and graphical elements
US20220206588A1 (en) Micro hand gestures for controlling virtual and graphical elements
US20220326781A1 (en) Bimanual interactions between mapped hand regions for controlling virtual and graphical elements
US9477312B2 (en) Distance based modelling and manipulation methods for augmented reality systems using ultrasonic gloves
EP3090331B1 (en) Systems with techniques for user interface control
KR101546654B1 (en) Method and apparatus for providing augmented reality service in wearable computing environment
US6757068B2 (en) Self-referenced tracking
US9310891B2 (en) Method and system enabling natural user interface gestures with user wearable glasses
US8310537B2 (en) Detecting ego-motion on a mobile device displaying three-dimensional content
US20150220158A1 (en) Methods and Apparatus for Mapping of Arbitrary Human Motion Within an Arbitrary Space Bounded by a User's Range of Motion
JP7382994B2 (en) Tracking the position and orientation of virtual controllers in virtual reality systems
US20210068674A1 (en) Track user movements and biological responses in generating inputs for computer systems
CN113498502A (en) Gesture detection using external sensors
WO2022228056A1 (en) Human-computer interaction method and device
Park et al. A simple vision-based head tracking method for eye-controlled human/computer interface
TW201933041A (en) Virtual space positioning method and apparatus
US12013985B1 (en) Single-handed gestures for reviewing virtual content
CN114327042B (en) Detection glove, gesture tracking method, AR equipment and key pressing method
TWI826189B (en) Controller tracking system and method with six degrees of freedom
Rusnak Unobtrusive Multi-User Interaction in Group Collaborative Environments
Sorger Alternative User Interfaces

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22794527; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 22794527; Country of ref document: EP; Kind code of ref document: A1)