WO2023179264A1 - Air input method, device and system - Google Patents

Air input method, device and system

Info

Publication number
WO2023179264A1
Authority
WO
WIPO (PCT)
Prior art keywords
pointing device
point
fixed
air
visual sensor
Prior art date
Application number
PCT/CN2023/077011
Other languages
French (fr)
Chinese (zh)
Inventor
郑文植
何梦佳
柏忠嘉
张行
程林松
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023179264A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 — Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/017 — Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06F 3/03 — Arrangements for converting the position or the displacement of a member into a coded form
    • G06F 3/033 — Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F 3/0346 — Pointing devices displaced or positioned by the user, with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • G06F 3/0354 — Pointing devices displaced or positioned by the user, with detection of 2D relative movements between the device, or an operating part thereof, and a plane or surface, e.g. 2D mice, trackballs, pens or pucks
    • G06F 3/03542 — Light pens for emitting or receiving light

Definitions

  • The present application relates to electronic devices, and in particular, to an air input method, device and system.
  • The air input system can meet the needs of team collaboration scenarios such as teaching, meetings, and large-scale office work, and is a new terminal category for team collaboration. At present, air input systems are widely used in the finance, education and training, and medical industries, where they play an important role in simplifying office work.
  • The air input system includes an air input device 110 and a pointing device 120.
  • The air input device 110 may include, but is not limited to: a smart screen, an electronic whiteboard, a projection device, a conference tablet, etc.
  • the following will take a smart screen as an example for explanation.
  • The user can provide input to the smart screen through the pointing device 120 through the air.
  • Input through the air means that the pointing device 120 is not in contact with the smart screen.
  • The user generates a fixed-point trajectory in the air with the pointing device 120.
  • A corresponding fixed-point trace is then displayed on the smart screen.
  • the distance between the pointing device 120 and the smart screen can be 50 centimeters, 1 meter, 2 meters, 3 meters or even more.
  • the shape of the fixed-point trajectory drawn by the user in the air through the pointing device 120 and the fixed-point trace displayed on the smart screen should be the same, and the size can be proportional.
  • For example, the fixed-point trajectory and the fixed-point trace may be circles at a 1:2 ratio, squares at a 1:1 ratio, triangles at a 2:1 ratio, and so on.
  • The above fixed-point trajectories and fixed-point traces are explained using regular figures as examples. In practical applications, fixed-point trajectories and fixed-point traces can also be irregular figures, numbers, characters, letters, words, etc., which are not specifically limited here.
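  • As a minimal illustration of the shape-preserving, proportionally scaled relationship described above, the following sketch maps an in-air trajectory onto a display trace. The 1:2 scale factor, the orthographic projection onto the display plane, and all names are assumptions for illustration, not details from the application.

```python
# Minimal sketch: map an in-air fixed-point trajectory (x, y, z) to a fixed-point
# trace on the display, keeping the shape and applying a uniform scale.
# The scale factor and the projection onto the x-y display plane are assumed.

def trajectory_to_trace(trajectory, start_point, scale=2.0):
    """trajectory: list of (x, y, z) points drawn in the air;
    start_point: (u, v) fixed-point starting point in display coordinates."""
    x0, y0, _ = trajectory[0]
    u0, v0 = start_point
    trace = []
    for x, y, _z in trajectory:
        # Same shape, proportional size: displacements are scaled uniformly.
        trace.append((u0 + scale * (x - x0), v0 + scale * (y - y0)))
    return trace

# Example: a 1:2 relationship turns a 10 cm in-air circle into a 20 cm on-screen circle.
```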
  • In this way, the user can complete input on the smart screen through the air from a location far away from the smart screen, without having to walk up to the smart screen, which is very convenient for users. For example, when a teacher is presenting a speech document on the smart screen from the podium and a student raises a question about a certain part of it, the student can use the pointing device 120 to mark the speech document on the smart screen from his or her seat, without having to walk from the seat to the front of the smart screen to mark the document.
  • In practice, however, the fixed-point trajectory of the pointing device 120 in the air and the fixed-point trace displayed on the smart screen are often different. For example, as shown in Figure 1, the fixed-point trajectory written clockwise by the user in the air starting from point A is a standard circle, but the fixed-point trace displayed on the smart screen, written clockwise starting from point A', is an irregular shape.
  • This application provides an air input method, device and system, which can improve the accuracy of determining fixed-point trajectories in the air.
  • A first aspect provides an air input method, including:
  • The air input device determines a fixed-point starting point in the display area; during the three-dimensional fixed-point movement of a first pointing device in the air, the air input device determines the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by a first visual sensor, the pose information of the first pointing device collected by a first IMU in the first pointing device, and a mapping relationship between first pose information and a first fixed-point trajectory, wherein the first pose information includes the pose information of a second pointing device collected by a second visual sensor and the pose information of the second pointing device collected by a second IMU; and the air input device determines the fixed-point trace in the display area based on the fixed-point starting point and the fixed-point trajectory.
  • the first visual sensor and the second visual sensor may be the same visual sensor, or they may be two different visual sensors.
  • For example, the first visual sensor and the second visual sensor may be the same stereo camera, or they may be two different stereo cameras.
  • the first visual sensor can be a stereo camera, the second visual sensor a lidar, and so on.
  • the first pointing device and the second pointing device may be the same device, or may not be the same device.
  • the first IMU and the second IMU are the same IMU.
  • Alternatively, the first IMU and the second IMU are two different IMUs.
  • the first visual sensor and the second visual sensor are the same visual sensor, but the first pointing device and the second pointing device are different pointing devices.
  • the first visual sensor and the second visual sensor are the same visual sensor, and the first pointing device and the second pointing device are the same pointing device.
  • the first visual sensor and the second visual sensor are different visual sensors, but the first pointing device and the second pointing device are the same pointing device.
  • the first visual sensor and the second visual sensor are different visual sensors, and the first pointing device and the second pointing device are also different pointing devices.
  • Obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory may or may not occur at the same time.
  • In one case, training to obtain the mapping relationship and using the mapping relationship to determine the fixed-point trajectory occur in the same time and space. Therefore, while the mapping relationship is being used to determine the fixed-point trajectory, the mapping relationship can continue to change, so that its accuracy keeps improving during use.
  • In another case, the mapping relationship is obtained through training first, and the mapping relationship is then used to determine the fixed-point trajectory.
  • In this case, the mapping relationship no longer changes. Therefore, the air input device does not need to undertake the task of training to obtain the mapping relationship, which reduces the load of the air input device. Of course, depending on the actual situation, the mapping relationship can still be obtained by training on the air input device.
  • In the above solution, the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU are both referred to; that is, pose information of the first pointing device collected by sensors from multiple angles is used to determine the fixed-point trajectory of the first pointing device in the air, thereby improving the accuracy of the determined fixed-point trajectory.
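  • The application does not spell out a particular fusion rule at this point; the following is a minimal sketch of combining the two pose observations per time step. The fixed weights and the six-element pose layout are assumptions for illustration only.

```python
import numpy as np

def fuse_poses(pose_visual, pose_imu, w_visual=0.7):
    """Blend the pose observed by the visual sensor with the pose from the IMU.
    pose_visual, pose_imu: arrays (x, y, z, angle_x, angle_y, angle_z);
    w_visual: weight given to the visual observation (assumed value).
    Angles are assumed to be small and unwrapped, so a linear blend is acceptable."""
    pose_visual = np.asarray(pose_visual, dtype=float)
    pose_imu = np.asarray(pose_imu, dtype=float)
    return w_visual * pose_visual + (1.0 - w_visual) * pose_imu

# Fusing a sample for every time step of the three-dimensional fixed-point movement
# yields the trajectory estimate used to render the fixed-point trace.
```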
  • The pose information of the first pointing device collected by the first visual sensor includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device, and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model.
  • The user's three-dimensional model and the three-dimensional model of the first pointing device are established based on the user's three-dimensional data and the three-dimensional data of the first pointing device collected by the first visual sensor.
  • the point in the three-dimensional model of the first pointing device may be any point in the three-dimensional model of the first pointing device, for example, it may be the end point, midpoint, etc. of the first pointing device.
  • For example, when the first pointing device is a writing pen, the point in the three-dimensional model of the first pointing device may be the pen tip.
  • The first visual sensor collects the user's three-dimensional data and the three-dimensional data of the first pointing device to establish the three-dimensional models, and determines the pose of the first pointing device through the user's three-dimensional model and the three-dimensional model of the first pointing device. Because modeling from the three-dimensional data collected by the visual sensor is highly accurate, the collected pose information of the first pointing device is also highly accurate, and therefore the accuracy of the fixed-point trajectory determined for the first pointing device in the air is also very high.
  • the starting point of the fixed point is the intersection point of the normal vector of a specific part in the three-dimensional model of the user or the three-dimensional model of the first pointing device and the display area.
  • the specific part of the user's three-dimensional model may be any part of the user's three-dimensional model, for example, the eye part, the nose tip part, the finger tip part, etc.
  • The priority of specific parts of the user's three-dimensional model can be set based on different conditions. For example, when the user points at the display area with a finger, the fingertip part is used first to determine the fixed-point starting point; when the user does not point at the display area with a finger, the eye part is used first to determine the fixed-point starting point.
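  • A minimal sketch of this two-level priority is given below; the object shape and attribute names are invented for illustration and are not part of the application.

```python
def pick_reference_part(user_model):
    """Choose which part of the user's 3D model anchors the fixed-point starting point.
    The fingertip-first, eyes-second priority follows the text above; the attribute
    names (finger_points_at_display, fingertip, eyes) are assumptions for this sketch."""
    if user_model.finger_points_at_display:   # user is pointing with a finger
        return user_model.fingertip
    return user_model.eyes                    # otherwise fall back to the eye part
```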
  • the specific part in the three-dimensional model of the first pointing device may be any part in the three-dimensional model of the first pointing device, for example, an endpoint part, a center part, etc.
  • In some other solutions, the fixed-point starting point is determined based only on the pose information of the IMU in the first pointing device. Because of errors in the IMU pose information, the fixed-point starting point determined in this way is often not the fixed-point starting point desired by the user, and the user often needs to adjust the pointing device multiple times to reach the desired starting point, resulting in very low efficiency.
  • In this application, a visual sensor is used to determine the fixed-point starting point, which can determine the fixed-point starting point desired by the user more accurately and efficiently.
  • Because modeling from the three-dimensional data collected by the visual sensor is highly accurate, determining the fixed-point starting point through the user's three-dimensional model or the three-dimensional model of the first pointing device, established from that three-dimensional data, is also highly accurate.
  • the starting point of the fixed point is the intersection point of the normal vector of the eye part in the user's three-dimensional model and the display area.
  • Determining the fixed-point starting point through the intersection of the normal vector of the eye part in the user's three-dimensional model with the display area is not only highly accurate, but also very convenient and user-friendly, because the fixed-point starting point is simply wherever the user's eyes are looking.
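  • Geometrically this is a ray-plane intersection. The sketch below shows one way to compute it, assuming all vectors are already expressed in the display-area coordinate system; the function and parameter names are illustrative only.

```python
import numpy as np

def fixed_point_start(eye_pos, eye_normal, plane_point, plane_normal):
    """Intersect the normal vector of the eye part with the display plane.
    eye_pos: eye position; eye_normal: viewing direction; plane_point / plane_normal
    define the display plane. Returns the intersection point or None if parallel."""
    eye_pos, d = np.asarray(eye_pos, float), np.asarray(eye_normal, float)
    p0, n = np.asarray(plane_point, float), np.asarray(plane_normal, float)
    denom = np.dot(n, d)
    if abs(denom) < 1e-9:            # gaze direction parallel to the display plane
        return None
    t = np.dot(n, p0 - eye_pos) / denom
    return eye_pos + t * d           # intersection = fixed-point starting point

# Example: display plane z = 0 with normal (0, 0, 1); the user stands 2 m away.
start = fixed_point_start(eye_pos=(0.3, 1.6, 2.0), eye_normal=(0, -0.1, -1.0),
                          plane_point=(0, 0, 0), plane_normal=(0, 0, 1))
```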
  • Before determining the fixed-point trajectory of the first pointing device in the air, the method further includes:
  • receiving the first pose information of the second pointing device, wherein the first pose information is the pose information of the second pointing device when the robotic arm controls the second pointing device to perform fixed-point movement in the air;
  • training a neural network with the first pose information and the first fixed-point trajectory to obtain the mapping relationship.
  • the fixed-point trajectory generated by the robotic arm is more accurate and stable than the fixed-point trajectory generated manually.
  • Using the robotic arm to generate the fixed-point trajectory during training can obtain a more accurate mapping relationship.
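  • The application describes training a neural network on pose sequences paired with robotic-arm ground-truth trajectories but does not fix an architecture; the sketch below is one minimal possibility. The recurrent model, feature sizes, and loss are assumptions, not the application's design.

```python
import torch
from torch import nn

# Hedged sketch: learn the mapping between "first pose information" sequences
# (poses of the second pointing device from the second visual sensor and second IMU)
# and the "first fixed-point trajectory" produced under robotic-arm control.

class TrajectoryPredictor(nn.Module):
    def __init__(self, pose_dim=12, hidden=64, out_dim=3):
        super().__init__()
        self.rnn = nn.LSTM(pose_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)    # one 3D trajectory point per time step

    def forward(self, pose_seq):                  # (batch, time, pose_dim)
        feats, _ = self.rnn(pose_seq)
        return self.head(feats)                   # (batch, time, 3)

def train_step(model, optimizer, pose_seq, arm_trajectory):
    """pose_seq: recorded first pose information; arm_trajectory: ground-truth
    fixed-point trajectory generated by the robotic arm (accurate and stable)."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(pose_seq), arm_trajectory)
    loss.backward()
    optimizer.step()
    return loss.item()
```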
  • the first visual sensor includes a stereo camera, lidar, depth camera or monocular camera.
  • Lidar, depth cameras and stereo cameras can be used directly as the first visual sensor.
  • A monocular camera can also be used to reduce cost; however, before being used, the monocular camera needs to be trained so that it has the ability to acquire three-dimensional data.
  • the first visual sensor is disposed in the display area, or may be disposed outside the display area.
  • When the first visual sensor is disposed in the display area, the angle between the line-of-sight axis of the first visual sensor and the normal of the display area is zero, which simplifies the subsequent calculation.
  • When the first visual sensor is disposed outside the display area, the degree of freedom in placing the first visual sensor is increased.
  • A second aspect provides an air input device, including a processor and a display unit, the processor being connected to the display unit, wherein:
  • the processor is configured to determine a fixed-point starting point in the display area generated by the display unit; during the three-dimensional fixed-point movement of the first pointing device in the air, determine the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the IMU in the first pointing device, and the mapping relationship between the first pose information and the first fixed-point trajectory, wherein the first pose information includes the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU; and determine the fixed-point trace in the display area according to the fixed-point starting point and the fixed-point trajectory;
  • the display unit is used to display the fixed point trace in the display area.
  • The device further includes a receiver, the receiver being configured to receive the pose information of the first pointing device collected by the first visual sensor, wherein the pose information of the first pointing device collected by the first visual sensor includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device, and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model; the user's three-dimensional model and the three-dimensional model of the first pointing device are established based on the user's three-dimensional data and the three-dimensional data of the first pointing device collected by the first visual sensor.
  • the starting point of the fixed point is the intersection point of the normal vector of a specific part in the three-dimensional model of the user or the three-dimensional model of the first pointing device and the display area.
  • the starting point of the fixed point is the intersection point of the normal vector of the eye part in the user's three-dimensional model and the display area.
  • The receiver is also configured to receive the first fixed-point trajectory sent by the robotic arm, wherein the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform fixed-point movement in the air; the receiver is also configured to receive the first pose information of the second pointing device, wherein the first pose information is the pose information of the second pointing device when the robotic arm controls the second pointing device to perform fixed-point movement in the air; and the processor is also configured to train a neural network with the first pose information and the first fixed-point trajectory to obtain the mapping relationship.
  • the first visual sensor includes a stereo camera, lidar, depth camera or monocular camera.
  • A third aspect provides an air input system, including:
  • a first pointing device used for performing three-dimensional fixed-point movement in the air, and collecting the pose information of the first pointing device through the first IMU in the first pointing device;
  • a first visual sensor used to collect position and orientation information of the first pointing device
  • an air input device, configured to determine a fixed-point starting point in the display area generated by the air input device; determine the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and the mapping relationship between the first pose information and the first fixed-point trajectory, wherein the first pose information includes the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU; and determine the fixed-point trace in the display area according to the fixed-point starting point and the fixed-point trajectory;
  • the air input device is also configured to display the fixed-point trace in the display area.
  • The air input device can be a smart screen, a projector with a fixed-point trace determination function, etc.
  • When the air input device is a smart screen, the display area generated by the air input device refers to the display area of the smart screen.
  • When the air input device is a projector with a fixed-point trace determination function, the display area generated by the air input device refers to the projection area generated by the projector.
  • the fixed-point trace determination function is implemented by a peripheral fixed-point trace determination device and is not integrated with the projector.
  • The fixed-point trace determination function includes receiving the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, and determining the fixed-point trace based on the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU.
  • a projector can be used to display the fixed point traces in the projection area.
  • The pose information of the first pointing device collected by the first visual sensor includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device, and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model.
  • The user's three-dimensional model and the three-dimensional model of the first pointing device are established based on the user's three-dimensional data and the three-dimensional data of the first pointing device collected by the first visual sensor.
  • the starting point of the fixed point is the intersection point of the normal vector of a specific part in the three-dimensional model of the user or the three-dimensional model of the first pointing device and the display area.
  • the starting point of the fixed point is the intersection point of the normal vector of the eye part in the user's three-dimensional model and the display area.
  • The air input device is also configured to receive the first fixed-point trajectory sent by the robotic arm, wherein the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform three-dimensional fixed-point movement in the air;
  • the air input device is also configured to receive the first pose information of the second pointing device, wherein the first pose information is the pose information of the second pointing device when the robotic arm controls the second pointing device to perform fixed-point movement in the air;
  • the air input device is also configured to train a neural network with the first pose information and the first fixed-point trajectory to obtain the mapping relationship.
  • the first visual sensor includes a stereo camera, lidar, depth camera or monocular camera.
  • A fourth aspect provides an air input device, including:
  • the starting point determination unit is used to determine the fixed point starting point in the display area
  • a trajectory determination unit, configured to determine, during the three-dimensional fixed-point movement of the first pointing device in the air, the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and the mapping relationship between the first pose information and the first fixed-point trajectory, wherein,
  • the first posture information includes the posture information of the second pointing device collected by the second visual sensor, and the posture information of the second pointing device collected by the second IMU;
  • a trace determination unit is configured to determine fixed-point traces in the display area according to the fixed-point starting point and the fixed-point trajectory.
  • The pose information of the first pointing device collected by the first visual sensor includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device, and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model.
  • The user's three-dimensional model and the three-dimensional model of the first pointing device are established based on the user's three-dimensional data and the three-dimensional data of the first pointing device collected by the first visual sensor.
  • the starting point of the fixed point is the intersection point of the normal vector of a specific part in the three-dimensional model of the user or the three-dimensional model of the first pointing device and the display area.
  • the starting point of the fixed point is the intersection point of the normal vector of the eye part in the user's three-dimensional model and the display area.
  • The device further includes a training unit, configured to receive the first fixed-point trajectory sent by the robotic arm, wherein the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform three-dimensional fixed-point movement in the air; receive the first pose information; and train the neural network with the first pose information and the first fixed-point trajectory to obtain the mapping relationship.
  • the first visual sensor includes a stereo camera, lidar, depth camera or monocular camera.
  • A computer-readable storage medium is provided, characterized in that it includes instructions that, when run on an air input device, cause the air input device to execute the method described in any one of the implementations of the first aspect.
  • Figure 1 is a schematic diagram of an air input scenario involved in this application.
  • Figure 2 is a schematic structural diagram of an air input system involved in this application.
  • Figure 3 is a schematic structural diagram of a smart screen provided by this application.
  • Figure 4 is a schematic structural diagram of a pointing device provided by this application.
  • Figure 5 is a schematic structural diagram of an IMU provided by this application.
  • Figure 6 is a schematic structural diagram of an air input system provided by this application.
  • Figure 7 is a schematic structural diagram of a projection device provided by this application.
  • Figure 8 is a schematic flow chart of an air input method provided by this application.
  • Figure 9 is a schematic diagram of the pose information of the pointing device collected by the IMU provided by this application.
  • Figure 10 is a schematic diagram of determining the fixed-point starting point based on the eye part of the user's three-dimensional model provided by this application.
  • Figure 11 is a schematic diagram of a trajectory prediction model provided by this application.
  • Figure 12 is a schematic structural diagram of an air input device provided by this application.
  • Figure 13 is a schematic structural diagram of an air input device provided by this application.
  • Figure 2 is a schematic structural diagram of an air input system provided by the present application.
  • The air input system provided by this application includes: a smart screen 110, a pointing device 120, and a visual sensor 130.
  • As shown in Figure 3, the smart screen 110 may include: a processor 112, a memory 113, a wireless communication module 114, a power switch 115, a wired local area network (LAN) communication module 116, a high definition multimedia interface (HDMI) communication module 117, a universal serial bus (USB) communication module 118, and a display 119. Among them:
  • Processor 112 may be used to read and execute computer-readable instructions.
  • the processor 112 may mainly include a controller, arithmetic unit, and a register.
  • the controller is mainly responsible for decoding instructions and issuing control signals for operations corresponding to the instructions.
  • the arithmetic unit is mainly responsible for performing fixed-point or floating-point arithmetic operations, shift operations, and logical operations. It can also perform address operations and conversions.
  • Registers are mainly responsible for storing register operands and intermediate operation results temporarily stored during instruction execution.
  • processor 112 may be used to parse signals received by wireless communication module 114 and/or wired LAN communication module 116 .
  • the processor 112 may be configured to perform corresponding processing operations according to the parsing results, such as generating a detection response, driving the display 119 to perform display according to the display request or display instruction, and so on.
  • The processor 112 can also be used to generate signals sent externally by the wireless communication module 114 and/or the wired LAN communication module 116, such as Bluetooth broadcast signals, beacon signals, or signals sent to electronic devices to feed back the display status (such as display success, display failure, etc.).
  • Memory 113 is coupled to processor 112 for storing various software programs and/or sets of instructions.
  • the memory 113 may include high-speed random access memory, and may also include non-volatile memory, such as one or more disk storage devices, flash memory devices or other non-volatile solid-state storage devices.
  • the memory 113 can store operating systems, such as uCOS, VxWorks, RTLinux and other embedded operating systems.
  • Memory 113 may also store communications programs that may be used to communicate with one or more servers, or additional devices.
  • the wireless communication module 114 may include one or more of a Bluetooth (bluetooth) communication module 114A and a wireless local area networks (WLAN) communication module 114B.
  • One or more of the Bluetooth (BT) communication module 114A and the WLAN communication module 114B can monitor signals transmitted by other devices, such as detection requests and scanning signals, and can send response signals, such as detection responses and scanning responses, so that other devices can discover the smart screen 110, establish wireless communication connections with other devices, and communicate with other devices through one or more wireless communication technologies such as Bluetooth or WLAN.
  • one or more of the Bluetooth communication module 114A and the WLAN communication module 114B can also transmit signals, such as broadcasting Bluetooth signals and beacon signals, so that other devices can discover the smart screen 110 and communicate with other devices. Establish a wireless communication connection to communicate with other devices through one or more wireless communication technologies such as Bluetooth or WLAN.
  • the wireless communication module 114 may also include: Bluetooth, WLAN, near field communication (NFC), ultra wide band (UWB), infrared, and so on.
  • the power switch 115 can be used to control the power supply to the smart screen 110 .
  • the wired LAN communication module 116 can be used to communicate with other devices in the same LAN through the wired LAN, and can also be used to connect to the wide area network through the wired LAN and communicate with devices in the wide area network.
  • HDMI communication module 117 may be used to communicate with other devices through an HDMI interface (not shown).
  • USB communication module 118 may be used to communicate with other devices through a USB interface (not shown).
  • Display 119 may be used to display images, videos, etc.
  • The display 119 can be a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, an active-matrix organic light-emitting diode (AMOLED) display, a flexible light-emitting diode (FLED) display, a quantum dot light-emitting diode (QLED) display, etc.
  • the smart screen 110 may also include an audio module (not shown).
  • the audio module can be used to output audio signals through the audio output interface, so that the smart screen 110 supports audio playback.
  • the audio module can also be used to receive audio data through the audio input interface.
  • the smart screen 110 can be a media playback device such as a television.
  • the smart screen 110 may also include a serial interface such as an RS-232 interface.
  • the serial interface can be connected to other devices, such as speakers and other audio external amplifiers, so that the smart screen 110 and the audio external amplifiers can cooperate to play audio and video.
  • the structure illustrated in Figure 3 does not constitute a specific limitation on the smart screen 110.
  • the smart screen 110 may include more or fewer components than shown in the figure, or combine some components, or split some components, or arrange different components.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the pointing device 120 may be a drawing device provided with an IMU.
  • the pointing device can be a pen-shaped device or a device of other shapes. Pointing devices can be pens, controllers, etc.
  • the pointing device 120 may include: a processor 122, a memory 123, a wireless communication module 124, a charging management module 125, a USB interface 126, a battery 127, a power management module 128 and an IMU 129.
  • Processor 122 may be used to read and execute computer-readable instructions.
  • the processor 122 may mainly include a controller, arithmetic unit, and a register.
  • the controller is mainly responsible for decoding instructions and issuing control signals for operations corresponding to the instructions.
  • the arithmetic unit is mainly responsible for performing fixed-point or floating-point arithmetic operations, shift operations, and logical operations. It can also perform address operations and conversions.
  • Registers are mainly responsible for storing register operands and intermediate operation results temporarily stored during instruction execution.
  • the processor 122 may also adopt a heterogeneous architecture, such as an ARM+DSP architecture, an ARM+ASIC architecture, an ARM+AI chip architecture, and so on.
  • Memory 123 is coupled to processor 122 for storing various software programs and/or sets of instructions.
  • the memory 123 may include high-speed random access memory, and may also include non-volatile memory, such as one or more disk storage devices, flash memory devices or other non-volatile solid-state storage devices.
  • the memory 123 can store operating systems, such as uCOS, VxWorks, RTLinux and other embedded operating systems.
  • the wireless communication module 124 may include one or more of a Bluetooth (BT) communication module, a WLAN communication module, near field communication (NFC), ultra wide band (UWB), infrared, and the like.
  • the charge management module 125 is used to receive charging input from the charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 125 may receive charging input from the wired charger through the USB interface 126 .
  • the charging management module 125 may receive wireless charging input through a wireless charging coil. While the charging management module 125 charges the battery 127, it can also provide power to the pointing device through the power management module 128.
  • the power management module 128 is used to connect the battery 127, the charging management module 125 and the processor 122.
  • the power management module 128 receives input from the battery 127 and/or the charging management module 125 and supplies power to the processor 122, the memory 123, the wireless communication module 124, the IMU 129, and the like.
  • the power management module 128 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters.
  • the power management module 128 may also be provided in the processor 122 .
  • the power management module 128 and the charging management module 125 can also be provided in the same device.
  • IMU 129 is a device that measures the three-axis angular velocity and acceleration of an object.
  • An IMU may be equipped with a three-axis gyroscope and a three-axis accelerometer to measure the angular velocity and acceleration of an object in three-dimensional space.
  • an IMU only provides users with three-axis angular velocity and three-axis acceleration data.
  • A vertical reference unit (VRU) is based on the IMU; it uses the gravity vector as a reference and uses algorithms such as Kalman filtering or complementary filtering to provide the user with a pitch angle and a roll angle referenced to the gravity vector, and a heading angle without a reference standard.
  • The so-called 6-axis attitude module belongs to this type of system; there is no reference for the heading angle.
  • The heading angle will be 0° (or a set constant) after startup, and as the working time of the module increases, the heading angle slowly accumulates error. Since the pitch angle and roll angle have the gravity vector as a reference, there is no cumulative error over long periods under low-maneuvering conditions.
  • An attitude and heading reference system (AHRS) adds a magnetometer or optical-flow sensor to the VRU and uses algorithms such as Kalman filtering or complementary filtering to provide the user with pitch, roll and heading angles with absolute references; this type of system is used to provide accurate and reliable attitude and heading information for aircraft.
  • What we usually call a 9-axis attitude sensor belongs to this type of system; because the heading angle is referenced to the geomagnetic field, it does not drift.
  • Accelerometers, gyroscopes, magnetometers, etc. are used as basic inputs, and the attitude information, position information and speed information required by the user are output through the data acquisition unit, the calibration and compensation unit, the data fusion unit, and the output and configuration unit.
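  • To illustrate the data-fusion stage mentioned above, the following is a minimal complementary-filter sketch for a single attitude angle. The filter coefficient and sampling period are assumptions; a real AHRS additionally uses the magnetometer to stabilize the heading angle.

```python
def complementary_filter(angle_prev, gyro_rate, accel_angle, dt, alpha=0.98):
    """Fuse a gyroscope rate and an accelerometer-derived angle for pitch or roll.
    angle_prev: previous fused angle (rad); gyro_rate: angular velocity (rad/s);
    accel_angle: angle computed from the gravity vector (rad); dt: sample period (s);
    alpha: trust placed in the integrated gyroscope (assumed value)."""
    gyro_angle = angle_prev + gyro_rate * dt                # short-term: integrate the gyro
    return alpha * gyro_angle + (1 - alpha) * accel_angle   # long-term: gravity reference
```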
  • the structure illustrated in FIG. 4 does not constitute a specific limitation on the pointing device 120 .
  • the pointing device 120 may include more or fewer components than shown in the figures, or some components may be combined, or some components may be separated, or may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the visual sensor 130 may be a monocular camera, a stereo camera, a depth camera or a laser radar that has been trained to acquire three-dimensional data, and is not specifically limited here.
  • The visual sensor may be integrated with the air input device, or may be provided separately, which is not specifically limited here.
  • The structure illustrated in Figure 2 does not constitute a specific limitation on the air input system.
  • The air input system may include more or fewer components than shown in the figures, or some components may be combined, or some components may be split, or the components may be arranged differently.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • Figure 6 is a schematic structural diagram of an air input system provided by the present application.
  • The air input system provided by this application includes: a projector 210 with a fixed-point trace determination function, a pointing device 220, and a visual sensor 230.
  • The projector 210 with the fixed-point trace determination function may include: a processor 212, a memory 213, a wireless communication module 214, a power switch 215, a wired LAN communication module 216, an HDMI communication module 217, a light source controller 218, and an image projector 219. Among them:
  • Processor 212 may be used to read and execute computer-readable instructions.
  • the processor 212 may mainly include a controller, arithmetic unit, and a register.
  • the controller is mainly responsible for decoding instructions and issuing control signals for operations corresponding to the instructions.
  • the arithmetic unit is mainly responsible for performing fixed-point or floating-point arithmetic operations, shift operations, and logical operations. It can also perform address operations and conversions.
  • Registers are mainly responsible for storing register operands and intermediate operation results temporarily stored during instruction execution.
  • the processor 212 may be used to parse signals received by the wireless communication module 214 and/or the wired LAN communication module 216, such as broadcast detection requests, projection requests, projection instructions sent by the server of the cloud projection service provider, etc.
  • The processor 212 may be configured to perform corresponding processing operations according to the parsing results, such as generating a detection response, or driving the light source controller 218 and the image projector 219 to perform projection operations according to the projection request or projection instruction, and so on.
  • The processor 212 can also be used to generate signals sent externally by the wireless communication module 214 and/or the wired LAN communication module 216, such as Bluetooth broadcast signals, beacon signals, or signals sent to electronic devices to feed back the projection status (such as projection success, projection failure, etc.).
  • Memory 213 is coupled to processor 212 for storing various software programs and/or sets of instructions.
  • the memory 213 may include high-speed random access memory, and may also include non-volatile memory, such as one or more disk storage devices, flash memory devices or other non-volatile solid-state storage devices.
  • the memory 213 can store operating systems, such as uCOS, VxWorks, RTLinux and other embedded operating systems.
  • Memory 213 may also store communications programs that may be used to communicate with one or more servers, or additional devices.
  • the wireless communication module 214 may include one or more of a Bluetooth (BT) communication module 214A and a WLAN communication module 214B.
  • One or more of the Bluetooth (BT) communication module and the WLAN communication module can monitor signals transmitted by other devices, such as detection requests and scanning signals, and can send response signals, such as detection responses and scanning responses, so that other devices can discover the projector 210, establish wireless communication connections with other devices, and communicate with other devices through one or more wireless communication technologies such as Bluetooth or WLAN.
  • One or more of the Bluetooth (BT) communication module and the WLAN communication module can also transmit signals, such as broadcasting Bluetooth signals and beacon signals, so that other devices can discover the projector 210, establish a wireless communication connection with it, and communicate with it through one or more wireless communication technologies such as Bluetooth or WLAN.
  • Wireless communication module 214 may also include a cellular mobile communication module (not shown).
  • the cellular mobile communication module can communicate with other devices (such as servers) through cellular mobile communication technology.
  • the power switch 215 may be used to control power supply to the projector 210 .
  • the wired LAN communication module 216 can be used to communicate with other devices in the same LAN through the wired LAN, and can also be used to connect to the wide area network through the wired LAN and communicate with devices in the wide area network.
  • HDMI communication module 217 may be used to communicate with other devices through an HDMI interface (not shown).
  • the image projector 219 may have a light source (not shown), may modulate light emitted from the light source according to image data and project an image on the screen.
  • the light source controller 218 may be used to control lighting of the light source provided by the image projector 219 .
  • the structure illustrated in FIG. 6 does not constitute a specific limitation on the projector 210 with the function of determining fixed point traces.
  • the projector 210 with fixed point trace determination function may include more or less components than shown in the figure, or combine some components, or split some components, or arrange different components.
  • the components illustrated may be implemented in hardware, software, or a combination of software and hardware.
  • the pointing device 220 and the visual sensor 230 may refer to the pointing device 120 and the visual sensor 130 shown in FIG. 2 .
  • the fixed-point trace determination function can be implemented by a peripheral fixed-point trace determination device without being integrated with the projector.
  • The fixed-point trace determination function includes receiving the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, and determining the fixed-point trace based on the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU.
  • a projector can be used to display the fixed point traces in the projection area.
  • Figure 8 is a schematic flow chart of an air input method provided by the present application.
  • The air input method provided by this application includes:
  • the first pointing device performs three-dimensional pointing movement in the air, and collects the pose information of the first pointing device through the first IMU in the first pointing device.
  • the first pointing device may be a pointing device in the air-to-air input system shown in FIG. 2 or FIG. 6 . Please refer to Figure 2 or Figure 6 and related descriptions for the specific structure of the first pointing device, and will not be described in detail here.
  • the pose information of the first pointing device collected by the first IMU includes: the three-dimensional coordinates of the first pointing device and the angle of the first pointing device.
  • Specifically, the first pointing device measures data with the gyroscope, accelerometer and magnetometer in the first IMU, and the measured data passes through the data acquisition unit, the calibration and compensation unit, the data fusion unit, and the output and configuration unit, which perform acquisition, calibration and compensation, data fusion and output configuration to obtain the three-dimensional coordinates of the first pointing device (the position information in Figure 5) and the angles of the first pointing device (the attitude information in Figure 5).
  • The pose information of the first pointing device collected by the first IMU is (x_pen, y_pen, z_pen, yaw, roll, pitch), where x_pen is the coordinate value of the first pointing device on the x-axis relative to the starting time, y_pen is the coordinate value of the first pointing device on the y-axis relative to the starting time, z_pen is the coordinate value of the first pointing device on the z-axis relative to the starting time, yaw is the heading (yaw) angle of the first pointing device relative to the starting time, roll is the roll angle of the first pointing device relative to the starting time, and pitch is the pitch angle of the first pointing device relative to the starting time.
  • the original data collected by the first IMU are the acceleration of the x-axis, y-axis and z-axis and the angular velocity of the x-axis, y-axis and z-axis.
  • the pose information x_pen is obtained by integrating the acceleration of the x-axis twice.
  • the pose information y_pen is obtained by integrating the acceleration of the y-axis twice.
  • the pose information z_pen is obtained by integrating the acceleration of the z-axis twice.
  • the pose information yaw is obtained by integrating the angular velocity of the x-axis once.
  • the pose information roll is obtained by integrating the angular velocity of the y-axis once
  • The pose information pitch is obtained by integrating the angular velocity of the z-axis once. Because the accelerations of the x-axis, y-axis and z-axis and the angular velocities of the x-axis, y-axis and z-axis collected by the first IMU all contain systematic errors and random errors, and the integration process amplifies these errors, the error of the obtained pose information is relatively large.
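  • The naive strapdown integration described above can be sketched as follows; the array shapes and the assumption of zero initial velocity are illustrative, and bias handling is deliberately omitted to show why the error grows.

```python
import numpy as np

def integrate_imu(accel, gyro, dt):
    """Naive integration of raw IMU samples, as described in the text above.
    accel, gyro: arrays of shape (N, 3) with x/y/z acceleration (m/s^2) and
    angular velocity (rad/s); dt: sampling period (s). Axis-to-angle assignment
    follows the text; no bias correction, so errors accumulate over time."""
    velocity = np.cumsum(accel * dt, axis=0)      # first integration of acceleration
    position = np.cumsum(velocity * dt, axis=0)   # second integration -> x_pen, y_pen, z_pen
    angles = np.cumsum(gyro * dt, axis=0)         # single integration -> yaw, roll, pitch
    return position, angles

# A constant acceleration bias b grows into 0.5 * b * t^2 of position error,
# which is why the IMU-only pose drifts and the visual sensor is also consulted.
```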
  • the first visual sensor collects the pose information of the first pointing device.
  • the first visual sensor may be the visual sensor in the air-to-air input system shown in FIG. 2 or FIG. 6 . Please refer to Figure 2 or Figure 6 and related descriptions for the specific structure of the first visual sensor, which will not be described again here.
  • the first visual sensor may be disposed in the display area or may be disposed outside the display area.
  • When the first visual sensor is disposed in the display area, the angle between the line-of-sight axis of the first visual sensor and the normal of the display area is zero. In this case, the complexity of subsequent calculations can be simplified.
  • When the first visual sensor is disposed outside the display area, the degree of freedom in placing the first visual sensor is increased.
  • The first visual sensor collects the three-dimensional data of the user and the three-dimensional data of the first pointing device. At this time, the collected three-dimensional data take the visual center of the first visual sensor as the coordinate origin, so a coordinate transformation relationship is needed to convert the user's three-dimensional data and the three-dimensional data of the first pointing device, which take the visual center of the first visual sensor as the coordinate origin, into three-dimensional data that take the center of the display area as the coordinate origin.
  • a three-dimensional model of the user is established based on the converted three-dimensional data of the user, and a three-dimensional model of the first pointing device is established based on the converted three-dimensional data of the first pointing device. Then, determine the pose information of the first pointing device according to the three-dimensional model of the user and the three-dimensional model of the first pointing device.
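  • The conversion between the two coordinate systems can be sketched as a rigid transform; the rotation R and translation t stand for the calibrated coordinate transformation relationship, and all names are illustrative rather than taken from the application.

```python
import numpy as np

def to_display_frame(points_sensor, R, t):
    """Convert 3D points expressed in the visual-sensor frame (origin at the visual
    center of the first visual sensor) into the display-area frame (origin at the
    center of the display area). R: 3x3 rotation; t: 3-vector translation."""
    points_sensor = np.asarray(points_sensor, dtype=float)   # shape (N, 3)
    return points_sensor @ R.T + t

# The converted user points and pointing-device points are then used to build the
# two three-dimensional models from which the pose information is read off.
```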
  • The pose information of the first pointing device collected by the first visual sensor includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device, and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model.
  • when the first visual sensor is installed in the smart screen, that is, when the first visual sensor is integrated in the smart screen, the positional relationship between the first visual sensor and the center of the display area of the smart screen does not change. Therefore, calibration can be performed in advance before leaving the factory to determine the coordinate transformation relationship between the visual center of the first visual sensor and the center of the display area; this coordinate transformation relationship will not change after leaving the factory.
  • when the first visual sensor is set outside the smart screen, that is, when the first visual sensor is an external device of the smart screen, one or several placement positions can be provided on the outer frame of the smart screen.
  • the positional relationship between the one or several placement positions and the center of the display area of the smart screen does not change. Therefore, calibration can be performed in advance before leaving the factory to determine the coordinate transformation relationship between a first visual sensor set at the one or several placement positions and the center of the display area; this coordinate transformation relationship will not change after leaving the factory.
  • alternatively, the first visual sensor is set at any position from which it can capture the display area of the smart screen; a calibration image is then played in the display area of the smart screen, and calibration is performed based on the calibration image to determine the coordinate transformation relationship between the visual center of the first visual sensor and the center of the display area.
  • when the first visual sensor is set outside the projector with the fixed-point trace determination function, one or several placement positions can be set with reference to the projection boundary.
  • for example, the first visual sensor can be set at the upper left corner or upper right corner of the projection boundary.
  • the relative positional relationship between the one or several placement positions and the center of the display area of the projector is known. Therefore, the coordinate transformation relationship between a first visual sensor disposed at the one or several placement positions and the center of the display area can be determined.
  • alternatively, the first visual sensor is set at any position from which it can capture the display area of the projector; a calibration image is then played in the display area of the projector, and calibration is performed based on the calibration image to determine the coordinate transformation relationship between the visual center of the first visual sensor and the center of the display area. It can be understood that when the projector with the fixed-point trace determination function is replaced by a combination of a fixed-point trace determination device and a projector, the calibration method for the coordinate transformation relationship is similar and is not described again here.
  • the pose information of the first pointing device collected by the first vision sensor is (x_tip, y_tip, z_tip, θ_x, θ_y, θ_z), where x_tip is the coordinate value of the point in the three-dimensional model of the first pointing device relative to the coordinate origin (0, 0, 0) on the x-axis, y_tip is the coordinate value of that point relative to the coordinate origin (0, 0, 0) on the y-axis, z_tip is the coordinate value of that point relative to the coordinate origin (0, 0, 0) on the z-axis, θ_x is the pitch angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model, θ_y is the roll angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model, and θ_z is the yaw angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model.
  • the wrist part in the first user's three-dimensional model can be extracted by inputting the first user's three-dimensional model into an extraction model, where the extraction model can be a deep learning (DL) network, a convolutional neural network (CNN), and so on.
  • the extraction model may be trained using a large number of three-dimensional models of a second user and the wrist parts of those three-dimensional models of the second user.
  • the three-dimensional model of the first user is collected by the first visual sensor
  • the three-dimensional model of the second user can be collected by the second visual sensor
  • the three-dimensional model of the second user can be manually annotated.
  • the first visual sensor and the second visual sensor may be the same visual sensor, or they may be two different visual sensors.
  • for example, the first visual sensor and the second visual sensor may be the same stereo camera, or they may be two different stereo cameras.
  • alternatively, the first visual sensor can be a stereo camera and the second visual sensor a lidar, and so on.
  • similarly, the first IMU and the second IMU may be the same IMU, or they may be two different IMUs.
  • the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part in the user's three-dimensional model can be determined in the following manner: connect the point in the three-dimensional model of the first pointing device with the point in the wrist part of the user's three-dimensional model by a line, and then find the rotation angle of the connecting line relative to a coordinate system whose origin is the point in the wrist part (a small numerical sketch follows the next few paragraphs).
  • the point in the three-dimensional model of the first pointing device may be any point in the three-dimensional model of the first pointing device; for example, it may be a vertex of the three-dimensional model of the first pointing device (for example, the pen tip in the three-dimensional model of a writing pen), the center of mass of the three-dimensional model of the first pointing device, the end point of the three-dimensional model of the first pointing device, and so on.
  • the pen tip in the three-dimensional model of the first pointing device is taken as an example for explanation.
  • the point in the wrist part of the user's three-dimensional model can be any point in the wrist part of the user's three-dimensional model.
  • it can be the center point of the wrist part in the user's three-dimensional model.
  • the center point of the wrist part in the user's three-dimensional model may be obtained by averaging the coordinate values of each point of the wrist part in the user's three-dimensional model.
  • here, a single point in the wrist part of the user's three-dimensional model is used for explanation. In other embodiments, two points, three points or even more points may be used, which is not specifically limited here.
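  • one possible way to compute the rotation angle described above is sketched below: the pen-tip point is connected to the wrist center point and the direction of that line is expressed as angles in a frame whose origin is the wrist point. The axis conventions, and the choice to return only two independent angles, are assumptions made for illustration; the patent does not fix an implementation.

```python
import numpy as np

def pen_angles_relative_to_wrist(pen_tip, wrist_center):
    """Direction of the wrist-to-pen-tip line in a frame centered at the wrist.

    pen_tip, wrist_center : (3,) points taken from the two 3-D models, already
    expressed in the same (display-centered) coordinate system.

    A single line only fixes two independent angles (azimuth, elevation);
    recovering a third angle (roll about the pen axis) would require further
    points on the pen model.
    """
    v = np.asarray(pen_tip, dtype=float) - np.asarray(wrist_center, dtype=float)
    azimuth = np.arctan2(v[1], v[0])                    # angle in the x-y plane
    elevation = np.arctan2(v[2], np.hypot(v[0], v[1]))  # angle above the x-y plane
    return azimuth, elevation
```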
  • the first visual sensor can provide relatively accurate absolute pose information relative to the center point of the display area, but the first visual sensor is usually unable to provide continuous absolute pose information. Therefore, in the embodiment of the present invention, a first IMU is introduced, which can continuously provide relatively accurate relative pose information relative to the starting position. Through the cooperation of the two sensors, it can not only provide continuous pose information, but also avoid the systematic errors and random errors caused by simply using the IMU, and improve the accuracy of determining the fixed-point trajectory in the air.
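  • the cooperation of the two sensors can be pictured as a simple correction loop: the first IMU supplies a continuous relative pose between vision updates, and each less frequent but drift-free vision measurement re-anchors the absolute pose. The sketch below is only a minimal complementary-style blend under that assumption, not the trained mapping relationship the patent actually uses.

```python
def fuse_poses(vision_pose, imu_delta, last_fused, vision_valid, alpha=0.9):
    """Blend an absolute vision pose with an IMU-propagated pose.

    vision_pose : latest absolute pose from the first visual sensor (or None)
    imu_delta   : relative pose change reported by the first IMU since the
                  previous fused estimate
    last_fused  : previous fused pose estimate
    vision_valid: whether a fresh vision measurement is available this step
    alpha       : weight given to the drift-free vision measurement

    All poses are 6-element sequences (x, y, z, yaw, roll, pitch); plain
    addition is used as a simplified stand-in for proper pose composition.
    """
    predicted = [p + d for p, d in zip(last_fused, imu_delta)]  # IMU propagation
    if not vision_valid:
        return predicted
    # pull the propagated pose back toward the absolute vision measurement
    return [alpha * v + (1 - alpha) * p for v, p in zip(vision_pose, predicted)]
```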
  • the first pointing device sends the pose information of the first pointing device collected by the first IMU to the air-to-air input device.
  • the air-to-air input device receives the pose information of the first pointing device collected by the first IMU and sent by the first pointing device.
  • the first pointing device is provided with one or more of a wireless communication module or a USB interface.
  • the air-to-air input device is also provided with one or more of a wireless communication module and a USB communication module.
  • when the first pointing device and the air-to-air input device communicate in a wireless manner, there is no data connection line between the first pointing device and the air-to-air input device, which makes the first pointing device more convenient to use.
  • when the first pointing device and the air-to-air input device communicate using USB, the data communication between the first pointing device and the air-to-air input device can be made smoother.
  • S104: the first visual sensor sends the pose information of the first pointing device collected by the first visual sensor to the air-to-air input device.
  • correspondingly, the air-to-air input device receives the pose information of the first pointing device collected by and sent by the first visual sensor.
  • the first visual sensor is provided with one or more of a wireless communication module, a USB interface, a wired LAN communication module, and an HDMI communication module.
  • correspondingly, the air-to-air input device is also provided with one or more of a wireless communication module, a USB communication module, a wired LAN communication module, and an HDMI communication module.
  • when the first visual sensor and the air-to-air input device communicate in a wireless manner, there is no data connection line between the first visual sensor and the air-to-air input device, which makes the first visual sensor more convenient to use.
  • when the first visual sensor and the air-to-air input device communicate using USB, wired LAN or HDMI, the data communication between the first visual sensor and the air-to-air input device can be made smoother.
  • the air-to-air input device determines the starting point of the fixed point in the display area.
  • the air-to-air input device may be the smart screen in Figure 2, the projector with the fixed-point trace determination function in Figure 6, an air-to-air input device composed of a fixed-point trace determination device and a projector, and so on. Please refer to Figure 2 or Figure 6 and the related descriptions for the specific structure of the air-to-air input device, which will not be described again here.
  • the display area may be an area capable of displaying a fixed-point trajectory, for example, it may be a displayable area of a smart screen, or a projection area of a projector, or the like.
  • the air-to-air input device determines the starting point of the fixed point in the display area as follows: the processor of the air-to-air input device obtains, from the memory of the air-to-air input device, the user's three-dimensional data and the three-dimensional data of the first pointing device collected by the first visual sensor. The processor of the air-to-air input device then establishes a three-dimensional model of the user and a three-dimensional model of the first pointing device based on the user's three-dimensional data and the three-dimensional data of the first pointing device, respectively.
  • the processor of the air-to-air input device determines the intersection of the normal vector of the specific part in the user's three-dimensional model or the three-dimensional model of the first pointing device and the display area as the starting point of the fixed point.
  • the specific part of the user's three-dimensional model may be any part of the user's three-dimensional model, for example, the eye part, the nose tip part, the finger tip part, etc.
  • the priority of specific parts of the user's three-dimensional model can be set based on different conditions. For example, when the user points to the display area with a finger, the tip of the finger is used first to determine the starting point of the fixed point.
  • the specific part in the three-dimensional model of the first pointing device may be any part in the three-dimensional model of the first pointing device, for example, an endpoint part, a center part, etc.
  • the point in the three-dimensional model of the first pointing device may be the tip of the writing pen. The following will be described with reference to the specific embodiment in FIG. 10 .
  • for example, the starting point of the fixed point is the intersection point (x, y, 0) of the normal vector (x_n, y_n, z_n) of the eye part in the user's three-dimensional model with the smart screen.
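  • taking the display as the plane z = 0 with the center of the display area as origin (an assumption consistent with the coordinate convention above), the intersection of the eye-part normal vector with the display can be obtained by a standard ray-plane calculation; a minimal sketch is shown below.

```python
import numpy as np

def fixed_point_start(eye_point, normal):
    """Intersection of the eye-part normal vector with the display plane z = 0.

    eye_point : (3,) position of the eye part in the display-centered frame
    normal    : (3,) normal vector (x_n, y_n, z_n) of the eye part

    Returns the fixed-point starting point (x, y, 0), or None when the
    normal never reaches the display plane.
    """
    p = np.asarray(eye_point, dtype=float)
    n = np.asarray(normal, dtype=float)
    if abs(n[2]) < 1e-9:
        return None                # gaze direction parallel to the display plane
    s = -p[2] / n[2]               # solve p_z + s * n_z = 0
    if s < 0:
        return None                # display plane lies behind the user
    hit = p + s * n
    return np.array([hit[0], hit[1], 0.0])
```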
  • in the existing technical solution, the starting point of the fixed point is determined based on the pose information of the IMU in the first pointing device.
  • the starting point of the fixed point determined based on the pose information of the IMU is often not the starting point desired by the user, and users often need to adjust the pointing device multiple times to reach the starting point they want, resulting in very low efficiency.
  • in this solution, the visual sensor is used to determine where the user's eyes are looking, which can determine the starting point of the fixed point that the user wants more accurately and efficiently.
  • moreover, wherever the user's eyes look, that is where the starting point of the fixed point is located, which is very convenient and user-friendly.
  • in addition, because three-dimensional modeling from data collected by a visual sensor is highly accurate, the accuracy of the starting point of the fixed point determined from the user's three-dimensional model or the three-dimensional model of the first pointing device established from that data is also very high.
  • the air-to-air input device determines the fixed-point trajectory of the first pointing device in the air.
  • the air-to-air input device determines the fixed-point trajectory of the first pointing device in the air as follows: during the movement of the first pointing device in the air, the processor of the air-to-air input device determines the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and the mapping relationship between the first posture information and the first fixed-point trajectory.
  • the first posture information in the mapping relationship includes the posture information collected by the second visual sensor and the posture information collected by the second IMU.
  • specifically, the processor of the air-to-air input device uses the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and a trajectory prediction model to determine the fixed-point trajectory of the first pointing device in the air.
  • the trajectory prediction model is obtained by training based on the pose information collected by the second visual sensor, the pose information collected by the second IMU, and the first fixed-point trajectory.
  • the trajectory prediction model can be a deep neural network (DNN), a linear regression model, and so on.
  • the fixed-point trajectory can be expressed as y = f(state 1, state 2), where y is the fixed-point trajectory of the first pointing device in the air, state 1 is the pose information of the first pointing device collected by the first visual sensor, state 2 is the pose information of the first pointing device collected by the first IMU, and f() is the mapping relationship.
  • for example, state 1 is the pose information of the first pointing device collected by the first visual sensor, (x_tip, y_tip, z_tip, θ_x, θ_y, θ_z), and state 2 is the pose information of the first pointing device collected by the first IMU, (x_pen, y_pen, z_pen, yaw, roll, pitch).
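  • in code, the mapping relationship can be viewed as a single function that consumes the two pose vectors and emits the in-air trajectory sample; the sketch below only fixes this interface, and f itself would be the trained trajectory prediction model (DNN, linear regression, etc.) described next.

```python
import numpy as np

def predict_trajectory(state1, state2, f):
    """y = f(state 1, state 2)

    state1 : pose of the first pointing device from the first visual sensor,
             (x_tip, y_tip, z_tip, theta_x, theta_y, theta_z)
    state2 : pose of the first pointing device from the first IMU,
             (x_pen, y_pen, z_pen, yaw, roll, pitch)
    f      : trained mapping relationship, e.g. a network's forward function

    Returns the predicted fixed-point trajectory sample of the first
    pointing device in the air.
    """
    return f(np.concatenate([state1, state2]))
```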
  • the trajectory prediction model may include an input layer, a hidden layer and an output layer.
  • the inputs of the input layer are S1 and S2, where S1 is the pose information of the first pointing device collected by the first visual sensor and S2 is the pose information of the first pointing device collected by the first IMU; the output of the input layer is equal to its input, that is, no processing is performed on the input.
  • in this example the input layer does not perform any processing; in other embodiments, the input layer may also normalize the input, which is not specifically limited here.
  • each hidden layer combines the two streams in the form a^(l+1) = W1^l·Z1^l + W2^l·Z2^l + b^l and Z^(l+1) = f(a^(l+1)), where W1^l is the weight vector of Z1^l of the l-th layer, W2^l is the weight vector of Z2^l of the l-th layer, b^l is the bias vector of the l-th layer, a^(l+1) is the intermediate vector of the l+1-th layer, Z^(l+1) is the hidden layer result of the l+1-th layer, and f is the excitation function.
  • the first excitation function and the second excitation function may be any one of the sigmoid function, the hyperbolic tangent function, the ReLU function, and so on.
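  • assuming the layer structure implied above (two weighted input streams, a bias, and an excitation function), one hidden-layer step could be written as follows; the exact architecture used in the patent may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_layer_step(z1_l, z2_l, W1_l, W2_l, b_l, excitation=sigmoid):
    """One layer of the trajectory prediction network.

    z1_l, z2_l : layer-l representations of the vision pose S1 and IMU pose S2
    W1_l, W2_l : weight matrices applied to z1_l and z2_l
    b_l        : bias vector of layer l
    excitation : excitation function (sigmoid, tanh, ReLU, ...)

    Computes a_{l+1} = W1_l @ z1_l + W2_l @ z2_l + b_l and returns
    Z_{l+1} = excitation(a_{l+1}).
    """
    a_next = W1_l @ z1_l + W2_l @ z2_l + b_l   # intermediate vector of layer l+1
    return excitation(a_next)                  # hidden layer result of layer l+1
```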
  • before the trajectory prediction model is used, it needs to be trained first.
  • the specific process of training the trajectory prediction model is to obtain a large amount of first attitude information and the corresponding first fixed point trajectory.
  • the first posture information includes the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU.
  • during training, the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU are input into the trajectory prediction model to obtain a predicted value, and the first fixed-point trajectory is used as the really desired target value.
  • the weight vector of each layer of the deep neural network in the trajectory prediction model is updated based on the difference between the current predicted value and the really desired target value (of course, there is usually an initialization process before the first update, in which parameters are pre-configured for each layer of the trajectory prediction model).
  • for example, if the predicted value of the trajectory prediction model is too high, the weight vectors are adjusted so that it predicts a lower value, and the adjustment continues until the trajectory prediction model can predict the really desired target value. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value".
  • this is the role of the loss function or objective function, which is used to measure the difference between the predicted value and the target value.
  • taking the loss function as an example, a higher output value (loss) of the loss function indicates a greater difference, so the training of the trajectory prediction model becomes a process of reducing this loss as much as possible.
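  • a minimal training loop matching the description above (predict, compare with the first fixed-point trajectory through a loss function, adjust the weights) might look like the following; the optimizer, loss and tensor shapes here are illustrative choices, not mandated by the patent.

```python
import torch
import torch.nn as nn

def train_mapping(model, pose_vision, pose_imu, target_traj, epochs=100, lr=1e-3):
    """Train the trajectory prediction model on first-posture-information data.

    model       : nn.Module taking the concatenated pose vectors as input
    pose_vision : (N, 6) tensor, poses of the second pointing device from the
                  second visual sensor
    pose_imu    : (N, 6) tensor, poses of the second pointing device from the
                  second IMU
    target_traj : (N, D) tensor, first fixed-point trajectory from the robotic arm
    """
    inputs = torch.cat([pose_vision, pose_imu], dim=1)
    loss_fn = nn.MSELoss()                         # measures predicted vs. target
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        predicted = model(inputs)                  # forward pass
        loss = loss_fn(predicted, target_traj)     # higher loss = bigger difference
        loss.backward()                            # gradients for each layer's weights
        optimizer.step()                           # adjust the weight vectors
    return model
```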
  • the first visual sensor and the second visual sensor may be the same visual sensor, or they may be two different visual sensors.
  • for example, the first visual sensor and the second visual sensor may be the same stereo camera, or they may be two different stereo cameras.
  • the first visual sensor can be a stereo camera, the second visual sensor a lidar, and so on.
  • the first pointing device and the second pointing device may be the same device, or may not be the same device.
  • when the first pointing device and the second pointing device are the same device, the first IMU and the second IMU are the same IMU; when the first pointing device and the second pointing device are two different devices, the first IMU and the second IMU are two different IMUs.
  • the first visual sensor and the second visual sensor are the same visual sensor, but the first pointing device and the second pointing device are different pointing devices.
  • the first visual sensor and the second visual sensor are the same visual sensor, and the first pointing device and the second pointing device are the same pointing device.
  • the first visual sensor and the second visual sensor are different visual sensors, but the first pointing device and the second pointing device are the same pointing device.
  • Obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory may occur at the same time or not at the same time.
  • when obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory occur at the same time, the mapping relationship can continue to change while it is being used, so that the accuracy of the mapping relationship is continuously improved during use.
  • for example, the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU while the mapping relationship is being used to determine the fixed-point trajectory can be used as the first posture information, and the fixed-point trajectory determined using the mapping relationship can be used as the first fixed-point trajectory, to further train the trajectory prediction model in the air-to-air input device.
  • the mapping relationship can be obtained through training first, and then the mapping relationship is used to determine the fixed-point trajectory. That is, obtaining the mapping relationship through training occurs in the past time and space, while using the mapping relationship to determine the fixed-point trajectory occurs in the present time and space.
  • a specialized training device can be tasked with training to obtain the mapping relationship to reduce the load on the air-to-air input device.
  • of course, depending on the actual situation, the mapping relationship can still be obtained by training on the air-to-air input device itself; in that case, the mapping relationship is obtained through training using historically collected data.
  • a robotic arm can be used to control the second pointing device to perform fixed-point movement in the air. That is, a robotic arm can be used to simulate a human arm's three-dimensional fixed-point movement in the air. In order to better simulate the human arm, the robotic arm can use a bionic arm that is comparable to a human arm.
  • the trajectory obtained by the robot arm controlling the second fixed-point device to perform three-dimensional fixed-point movement in the air is the first fixed-point trajectory.
  • the second visual sensor collects the three-dimensional data of the robotic arm and the three-dimensional data of the second fixed-point device, and establishes a three-dimensional model of the robotic arm based on the three-dimensional data of the robotic arm.
  • the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU in the second pointing device are collectively referred to as the first pose information.
  • the robotic arm sends the first fixed-point trajectory to the air-to-air input device or training device
  • the second visual sensor sends the pose information of the second fixed-point device collected by the second visual sensor to the air-to-air input device or training device.
  • and the second pointing device sends the pose information of the second pointing device collected by the second IMU in the second pointing device to the air-to-air input device or the training device, so as to train the trajectory prediction model.
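  • each training sample gathered this way pairs the first posture information with the corresponding segment of the first fixed-point trajectory; the record structure below is purely hypothetical and only illustrates how the three data streams line up.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class TrainingSample:
    """One sample of first posture information plus its ground-truth trajectory."""
    vision_pose: Sequence[float]     # pose of the second pointing device, second visual sensor
    imu_pose: Sequence[float]        # pose of the second pointing device, second IMU
    arm_trajectory: Sequence[float]  # first fixed-point trajectory reported by the robotic arm
```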
  • the positions of the first visual sensor and the second visual sensor remain unchanged.
  • for example, the first visual sensor can be placed in the middle of the upper edge of the smart screen, and the second visual sensor can also be placed in the middle of the upper edge of the smart screen.
  • the positions of the first visual sensor and the second visual sensor may also change.
  • the first visual sensor can be disposed in the center of the upper edge of the smart screen, and the second visual sensor can be disposed in the upper left corner of the smart screen.
  • in this case, a coordinate transformation relationship is needed to convert the three-dimensional data of the robotic arm and the three-dimensional data of the second pointing device, collected with the visual center of the second visual sensor as the coordinate origin, into three-dimensional data with the center of the display area as the coordinate origin.
  • when obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory occur at the same time, the first visual sensor and the second visual sensor are usually the same visual sensor, and the first pointing device and the second pointing device are the same pointing device.
  • in other possible scenarios, the first visual sensor and the second visual sensor may be the same visual sensor while the first pointing device and the second pointing device are different pointing devices; or the first visual sensor and the second visual sensor are the same visual sensor and the first pointing device and the second pointing device are the same pointing device; or the first visual sensor and the second visual sensor are different visual sensors while the first pointing device and the second pointing device are the same pointing device; or the first visual sensor and the second visual sensor are different visual sensors and the first pointing device and the second pointing device are also different pointing devices.
  • the air-to-air input device determines the fixed-point trace in the display area based on the fixed-point starting point and the fixed-point trajectory.
  • the processor of the air-to-air input device determines the fixed-point trace in the display area based on the fixed-point starting point and the fixed-point trajectory.
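  • combining the two pieces of information can be as simple as anchoring the (scaled) in-air trajectory at the fixed-point starting point on the display plane; the sketch below assumes the trajectory has already been expressed as 2-D displacements parallel to the display, which is an illustrative simplification rather than the patent's method.

```python
import numpy as np

def trace_from_trajectory(start_xy, trajectory_xy, scale=1.0):
    """Turn an in-air fixed-point trajectory into a fixed-point trace.

    start_xy      : (2,) fixed-point starting point in display coordinates
    trajectory_xy : (N, 2) successive displacements of the pointing device,
                    already projected onto a plane parallel to the display
    scale         : trace size relative to the in-air trajectory (e.g. 2.0
                    for a 1:2 trajectory-to-trace ratio)

    Returns the (N, 2) points of the trace to be drawn in the display area.
    """
    return np.asarray(start_xy) + scale * np.cumsum(trajectory_xy, axis=0)
```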
  • the air-to-air input device displays fixed-point traces in the display area.
  • when the air-to-air input device is a smart screen, the fixed-point trace can be displayed in the display area of the display of the smart screen; when the air-to-air input device is a projector with the fixed-point trace determination function, or an air-to-air input device composed of a fixed-point trace determination device and a projector, the fixed-point trace can be projected onto the projection area through the image projector of the projector.
  • for example, when the air-to-air input device is a projector with the fixed-point trace determination function, the projector can receive the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, determine the fixed-point trace according to the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, and display the fixed-point trace in the projection area; this example is used to explain the step of displaying the fixed-point trace in the display area.
  • alternatively, the air-to-air input device may include a fixed-point trace determination device and a projector, wherein the fixed-point trace determination device is configured to receive the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, and to determine the fixed-point trace based on the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU.
  • the projector can be used to display the fixed point trace in the projection area, which is not specifically limited here.
  • in the above description, the first visual sensor collects the three-dimensional data of the user and the three-dimensional data of the first pointing device and then determines the pose information of the first pointing device based on the three-dimensional data of the user and the three-dimensional data of the first pointing device.
  • in other embodiments, the first visual sensor can also collect the three-dimensional data of the user and the three-dimensional data of the first pointing device and send the three-dimensional data of the user and the three-dimensional data of the first pointing device collected by the first visual sensor to the air-to-air input device, and the air-to-air input device determines the pose information of the first pointing device based on the user's three-dimensional data and the three-dimensional data of the first pointing device.
  • similarly, the second visual sensor can also collect the three-dimensional data of the robotic arm and the three-dimensional data of the second pointing device and send them to the air-to-air input device or the training device, and the air-to-air input device or the training device determines the pose information of the second pointing device based on the three-dimensional data of the robotic arm and the three-dimensional data of the second pointing device.
  • steps S101 and S102 are executed in no particular order.
  • Step S101 may be executed first and then step S102, or step S102 may be executed first and then step S101, or step S101 and step S102 may be executed simultaneously.
  • the execution order of step S103 and step S104 is in no particular order.
  • Step S103 may be executed first and then step S104, or step S104 may be executed first and then step S103, or step S103 and step S104 may be executed simultaneously.
  • FIG. 12 shows a schematic structural diagram of an air-to-air input device provided by the present application.
  • the air-to-air input device includes:
  • the starting point determining unit 310 is used to determine a fixed point starting point in the display area.
  • the trajectory determination unit 320 is configured to determine, during the three-dimensional pointing movement of the first pointing device in the air, the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and the mapping relationship between the first posture information and the first fixed-point trajectory, wherein the first posture information includes the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU.
  • the trace determining unit 330 is configured to determine the fixed point trace in the display area according to the fixed point starting point and the fixed point trajectory.
  • the starting point determination unit 310, the trajectory determination unit 320, and the trace determination unit 330 work together to implement the steps performed by the air-to-air input device in S104.
  • the starting point determining unit 310 is used to perform the step of determining the fixed point starting point in S105
  • the trajectory determining unit 320 is used to perform the step of determining the fixed point trajectory of the first pointing device in the air in S106
  • the trace determining unit 330 is used to perform the step of determining the fixed-point trace in the display area in step S106 above.
  • the air-to-air input device may also include a training unit (not shown), the training unit being configured to receive the first fixed-point trajectory sent by the robotic arm, wherein the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform three-dimensional fixed-point movement in the air, to receive the first posture information, and to train the neural network through the first posture information and the first fixed-point trajectory to obtain the mapping relationship.
  • the air-to-air input device may also include a receiving unit (not shown) for receiving the pose information of the first pointing device collected by the first visual sensor (or the user's three-dimensional data and the three-dimensional data of the first pointing device collected by the first visual sensor), and the pose information of the first pointing device collected by the first IMU.
  • it can also be used to receive the pose information of the second pointing device collected by the second visual sensor (or the three-dimensional data of the mechanical arm and the three-dimensional data of the second pointing device collected by the second visual sensor).
  • the air-to-air input device may also include a display unit (not shown) for displaying fixed-point traces in the display area.
  • alternatively, the air-to-air input device may include a fixed-point trace determination device and a projector, wherein the starting point determination unit 310, the trajectory determination unit 320 and the trace determination unit 330 are provided in the fixed-point trace determination device and the display unit is provided in the projector, which is not specifically limited here.
  • for the position of the fixed-point starting point determined by the starting point determining unit 310 and the method of determining the fixed-point starting point, please refer to the relevant introduction in step S105 in FIG. 8.
  • for the remaining details, please refer to the relevant introduction in steps S101, S102 and S105 in FIG. 8, which will not be described again here.
  • FIG. 13 shows a schematic structural diagram of an air-to-air input device provided by the present application.
  • the air-to-air input device is used to perform the steps performed by the air-to-air input device in the above-mentioned air-to-air input method.
  • the air-to-air input device includes a memory 410 , a processor 420 , a communication interface 430 and a bus 440 .
  • the memory 410, the processor 420, and the communication interface 430 implement communication connections between each other through the bus 440.
  • the memory 410 may be a read only memory (ROM), a static storage device, a dynamic storage device or a random access memory (RAM).
  • the memory 410 may store computer instructions, such as: computer instructions in the starting point determination unit 310, computer instructions in the trajectory determination unit 320, computer instructions in the trace determination unit 330, etc.
  • the processor 420 and the communication interface 430 are used to execute part or all of the method described in the above steps S104 to S106.
  • the memory 410 can also store data, for example, intermediate data or result data generated by the processor 420 during execution, such as the user's three-dimensional data collected by the first visual sensor, the three-dimensional data of the first pointing device, the user's three-dimensional model, the three-dimensional model of the first pointing device, the mapping relationship, the pose information collected by the first visual sensor, the pose information collected by the first IMU, and so on.
  • the processor 420 may be a CPU, a microprocessor, an application specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits.
  • the processor 420 may also be an integrated circuit chip with signal processing capabilities. During the implementation process, part or all of the functions of the air-to-air input device may be completed by instructions in the form of hardware integrated logic circuits or software in the processor 420 .
  • the processor 420 may also be a general-purpose processor, a digital signal processor (DSP), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, which can implement or execute the methods, steps and logical block diagrams disclosed in the embodiments of this application.
  • the general-purpose processor can be a microprocessor or the processor can be any conventional processor, etc.
  • the steps of the method disclosed in conjunction with the embodiments of the present application can be directly executed by a hardware decoding processor, or can be executed by a combination of hardware and software modules in a decoding processor.
  • the software module can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other mature storage media in this field.
  • the storage medium is located in the memory 410.
  • the processor 420 reads the information in the memory 410 and completes steps S105 to S108 in the above-mentioned air-to-air input method in conjunction with its hardware.
  • the communication interface 430 uses a transceiver module such as, but not limited to, a transceiver to implement communication between the computing device and other devices (eg, camera device, microphone, server).
  • Bus 440 may include a path for transmitting information between various components in the air-to-air input device (eg, memory 410, processor 420, communication interface 430).
  • the air-to-air input device can be used as a terminal device for team collaboration and communication. Therefore, optionally, the air-to-air input device may also include a camera device 450 and a microphone 460 for real-time collection of image signals and sound signals. Alternatively, the air-to-air input device can also be connected to the camera device 450 and the microphone 460 through the communication interface 430 for real-time collection of image signals and sound signals.
  • the structure of the fixed-point trace determining device is similar to the air-to-air input device shown in Figure 13. However, the fixed-point trace determining device does not need to perform the step of displaying fixed-point traces in the display area in the above-mentioned air-to-air input method.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center through wired (such as coaxial cable, optical fiber, or digital subscriber line) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media.
  • the available media may be magnetic media (eg, floppy disk, storage disk, tape), optical media (eg, DVD), or semiconductor media (eg, Solid State Disk (SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Position Input By Displaying (AREA)

Abstract

The present application relates to an air input method, device and system. The method comprises: determining a fixed-point starting point in a display area; in a three-dimensional fixed-point movement in air, a first fixed-point device determines, according to the pose information of the first fixed-point device collected by a first vision sensor, the pose information of the first fixed-point device collected by a first inertial measurement unit in the first fixed-point device, and the mapping relation between the first pose information and a first fixed-point trajectory, the fixed-point trajectory of the first fixed-point device in air, wherein the first pose information comprises the pose information of a second fixed-point device collected by a second vision sensor and the pose information of the second fixed-point device collected by a second inertial measurement unit; and determining a fixed-point trace in the display area according to the fixed-point starting point and the fixed-point trajectory. The solution can improve the accuracy of determining the fixed-point trajectory in air.

Description

Air-to-air input method, device and system

Technical Field

The present application relates to electronic devices, and in particular, to an air-to-air input method, device and system.

Background Art

The air-to-air input system can meet the needs of team collaboration scenarios such as teaching, meetings and large-scale office work, and is a new terminal category oriented to team collaboration. At present, air-to-air input systems are widely used in the financial industry, the education and training industry, the medical industry and so on, and play an important role in simplifying office work in these industries.

As shown in Figure 1, the air-to-air input system includes an air-to-air input device 110 and a pointing device 120. The air-to-air input device 110 may include, but is not limited to, a smart screen, an electronic whiteboard, a projection device, a conference tablet, and so on. The following description takes a smart screen as an example. The user can input on the smart screen through the pointing device 120 from a distance. Inputting on the smart screen through the pointing device 120 from a distance means that the pointing device 120 is not in contact with the smart screen: the user generates a fixed-point trajectory in the air through the pointing device 120, and correspondingly, a fixed-point trace corresponding to the fixed-point trajectory can be displayed on the smart screen. Here, the distance between the pointing device 120 and the smart screen can be 50 centimeters, 1 meter, 2 meters, 3 meters or even more. Ideally, the fixed-point trajectory drawn by the user in the air through the pointing device 120 and the fixed-point trace displayed on the smart screen should have the same shape, and their sizes may be proportional. For example, the fixed-point trajectory and the fixed-point trace are circles at a ratio of 1:2, squares at a ratio of 1:1, triangles at a ratio of 2:1, and so on. It should be understood that the above fixed-point trajectories and fixed-point traces are described using regular figures as examples; in practical applications, fixed-point trajectories and fixed-point traces can also be irregular figures, numbers, characters, letters, text, and so on, which are not specifically limited here.

Through the air-distance input function of the pointing device 120, the user can complete input on the smart screen from a position far away from the smart screen without having to walk to the front of the smart screen, which provides great convenience for the user. For example, when a teacher displays a presentation on the smart screen from the podium, if a student raises a question about a certain part of it, the student can use the pointing device 120 to mark the presentation on the smart screen from his or her own seat, without having to walk from the seat to the smart screen to mark the presentation.

However, because the inertial measurement unit (IMU) in the pointing device 120 suffers from drift error, when a pointing device containing an IMU is used for air input, the fixed-point trajectory of the pointing device 120 in the air and the fixed-point trace displayed on the smart screen are often different. For example, as shown in Figure 1, the fixed-point trajectory written clockwise by the user in the air starting from point A is a standard circle, but the fixed-point trace displayed on the smart screen is an irregular figure written clockwise starting from point A'.
Summary of the Invention

The present application provides an air-to-air input method, device and system, which can improve the accuracy of determining a fixed-point trajectory in the air.

In a first aspect, an air-to-air input method is provided, including:

The air-to-air input device determines a fixed-point starting point in a display area; during the three-dimensional fixed-point movement of a first pointing device in the air, the fixed-point trajectory of the first pointing device in the air is determined according to the pose information of the first pointing device collected by a first visual sensor, the pose information of the first pointing device collected by a first IMU in the first pointing device, and a mapping relationship between first posture information and a first fixed-point trajectory, wherein the first posture information includes the pose information of a second pointing device collected by a second visual sensor and the pose information of the second pointing device collected by a second IMU; and a fixed-point trace in the display area is determined according to the fixed-point starting point and the fixed-point trajectory.

Here, the first visual sensor and the second visual sensor may be the same visual sensor, or they may be two different visual sensors. For example, the first visual sensor and the second visual sensor may be the same stereo camera, or they may be two different stereo cameras. Alternatively, the first visual sensor may be a stereo camera and the second visual sensor a lidar, and so on. Similarly, the first pointing device and the second pointing device may be the same device, or they may not be the same device. When the first pointing device and the second pointing device are the same device, the first IMU and the second IMU are the same IMU; when the first pointing device and the second pointing device are two different devices, the first IMU and the second IMU are two different IMUs. In some possible scenarios, the first visual sensor and the second visual sensor are the same visual sensor, but the first pointing device and the second pointing device are different pointing devices. Alternatively, the first visual sensor and the second visual sensor are the same visual sensor, and the first pointing device and the second pointing device are the same pointing device. Alternatively, the first visual sensor and the second visual sensor are different visual sensors, but the first pointing device and the second pointing device are the same pointing device. Alternatively, the first visual sensor and the second visual sensor are different visual sensors, and the first pointing device and the second pointing device are also different pointing devices.

Obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory may occur at the same time, or may not occur at the same time. When they occur at the same time, training to obtain the mapping relationship and using the mapping relationship take place in the same time and space; therefore, while the mapping relationship is being used to determine the fixed-point trajectory, the mapping relationship can continue to change, so that its accuracy is continuously improved during use. When they do not occur at the same time, the mapping relationship can be obtained through training first, and then used to determine the fixed-point trajectory; that is, obtaining the mapping relationship through training occurred in a past time and space, while using the mapping relationship occurs in the present time and space, so the mapping relationship no longer needs to change while it is being used. In this case, the air-to-air input device does not need to undertake the task of training to obtain the mapping relationship, which reduces the load on the air-to-air input device. Of course, depending on the actual situation, the mapping relationship can still be obtained by training on the air-to-air input device.

In the above solution, when determining the fixed-point trajectory of the first pointing device in the air, the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU in the first pointing device are both taken into account; that is, the fixed-point trajectory of the first pointing device in the air is determined from the pose information of the first pointing device collected by sensors from multiple angles, thereby improving the accuracy of determining the fixed-point trajectory in the air.
在一些可能的设计中,所述第一视觉传感器采集的所述第一定点设备的位姿信息包括所述第一定点设备的三维模型中的点的三维坐标,以及,所述第一定点设备的三维模型相对用户的三维模型中的手腕部位的旋转角度,所述用户的三维模型和所述第一定点设备的三维模型是根据所述第一视觉传感器采集的所述用户的三维数据和所述第一定点设备的三维数据建立的。In some possible designs, the pose information of the first pointing device collected by the first visual sensor includes three-dimensional coordinates of points in the three-dimensional model of the first pointing device, and, the first The rotation angle of the wrist part in the three-dimensional model of the pointing device relative to the user's three-dimensional model. The three-dimensional model of the user and the three-dimensional model of the first pointing device are the user's three-dimensional model collected according to the first visual sensor. The three-dimensional data and the three-dimensional data of the first pointing device are established.
这里,第一定点设备的三维模型中的点可以是第一定点设备的三维模型中的任意点,例如,可以是第一定点设备的端点、中点等等。在一更具体的实施例中,当第一定点设备是书写笔时,第一定点设备的三维模型中的点可以是书写笔的笔尖。Here, the point in the three-dimensional model of the first pointing device may be any point in the three-dimensional model of the first pointing device, for example, it may be the end point, midpoint, etc. of the first pointing device. In a more specific embodiment, when the first pointing device is a writing pen, the point in the three-dimensional model of the first pointing device may be the tip of the writing pen.
上述方案中,是通过第一视觉传感器采集用户的三维数据和第一定点设备的三维数据来建立三维模型,并通过用户的三维模型和所述第一定点设备的三维模型来确定第一定点设备 的位姿信息,因为通过视觉传感器采集三维数据建模具有准确度非常高的特点,采集的第一定点设备的位姿信息的准确度也非常高,因此,确定第一定点设备在空中的定点轨迹准确性也非常高。In the above solution, the first visual sensor collects the user's three-dimensional data and the three-dimensional data of the first pointing device to establish a three-dimensional model, and determines the first position through the user's three-dimensional model and the three-dimensional model of the first pointing device. pointing device Because the three-dimensional data modeling collected through the visual sensor has the characteristics of very high accuracy, the accuracy of the collected posture information of the first pointing device is also very high. Therefore, it is determined that the first pointing device is in the air. The accuracy of the fixed-point trajectory is also very high.
在一些可能的设计中,所述定点起点是所述用户的三维模型或所述第一定点设备的三维模型中的特定部位的法向量与所述显示区域的交点。In some possible designs, the starting point of the fixed point is the intersection point of the normal vector of a specific part in the three-dimensional model of the user or the three-dimensional model of the first pointing device and the display area.
这里,用户的三维模型的特定部位可以是用户的三维模型中的任意部位,例如,眼睛部位、鼻尖部位、手指尖端部位等等。并且,可以结合不同的条件设置用户的三维模型的特定部位的优先等级,例如,当用户是用手指指向显示区域时,优先使用手指尖端部位来确定定点起点,当用户没有用手指指向显示区域时,优先使用眼睛部位来确定定点起点。第一定点设备的三维模型中的特定部位可以是第一定点设备的三维模型中的任意部位,例如,端点部位、中心部位等等。Here, the specific part of the user's three-dimensional model may be any part of the user's three-dimensional model, for example, the eye part, the nose tip part, the finger tip part, etc. Moreover, the priority of specific parts of the user's three-dimensional model can be set based on different conditions. For example, when the user points to the display area with a finger, the tip of the finger is used first to determine the starting point of the fixed point. When the user does not point to the display area with a finger, , using the eye part first to determine the starting point of the fixed point. The specific part in the three-dimensional model of the first pointing device may be any part in the three-dimensional model of the first pointing device, for example, an endpoint part, a center part, etc.
在现有的技术方案中,是根据第一定点设备中的IMU的位姿信息来确定定点起点的,因此,根据IMU的位姿信息来确定的定点起点往往不是用户想要的定点起点,用户往往需要多次调整定点设备才能够定点到用户想要的定点起点,造成效率非常低下。而在本方案中,是结合视觉传感器来确定定点起点的,能够更加准确以及高效地确定用户想要的定点起点。另外,因为通过视觉传感器采集三维数据建模具有准确度非常高的特点,通过视觉传感器采集的三维数据建立的用户的三维模型或所述第一定点设备的三维模型来确定定点起点准确性也非常高。In the existing technical solution, the starting point of the fixed point is determined based on the pose information of the IMU in the first pointing device. Therefore, the starting point of the fixed point determined based on the pose information of the IMU is often not the starting point of the fixed point desired by the user. Users often need to adjust the pointing device multiple times to reach the starting point they want, resulting in very low efficiency. In this solution, a visual sensor is used to determine the fixed starting point, which can more accurately and efficiently determine the fixed starting point desired by the user. In addition, because the three-dimensional data modeling collected by the visual sensor has the characteristics of very high accuracy, it is also accurate to determine the starting point of the fixed point through the user's three-dimensional model established by the three-dimensional data collected by the visual sensor or the three-dimensional model of the first pointing device. very high.
在一些可能的设计中,所述定点起点是所述用户的三维模型中眼睛部位的法向量与所述显示区域的交点。In some possible designs, the starting point of the fixed point is the intersection point of the normal vector of the eye part in the user's three-dimensional model and the display area.
In the above solution, the fixed-point starting point is determined by the intersection of the normal vector of the eye part of the user's three-dimensional model with the display area. This is not only highly accurate, but also means the starting point is wherever the user's eyes are looking, which makes the scheme very convenient, intuitive, and user-friendly.
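As a concrete illustration of how such an intersection can be computed, the following is a minimal sketch under assumed conventions (the display area is taken as the plane z = 0 of the display-centered coordinate system, and the part's position and normal vector are assumed to already be expressed in that coordinate system); it is not the calculation claimed by this application, only one straightforward way to obtain the ray-plane intersection.

```python
import numpy as np

def fixed_point_start(part_point, part_normal, eps=1e-9):
    """Intersect the ray from `part_point` along `part_normal` with the display plane z = 0.

    Both arguments are 3D vectors in a coordinate system whose origin is the center of
    the display area (an assumed convention for this sketch). Returns the (x, y) starting
    point on the display, or None if the ray is parallel to the plane or points away from it.
    """
    p = np.asarray(part_point, dtype=float)
    n = np.asarray(part_normal, dtype=float)
    if abs(n[2]) < eps:          # ray parallel to the display plane
        return None
    t = -p[2] / n[2]             # solve p_z + t * n_z = 0
    if t < 0:                    # intersection lies behind the user or device
        return None
    hit = p + t * n
    return float(hit[0]), float(hit[1])

# Example: an eye at (0.1, 0.0, 1.5) m looking along (-0.05, 0.02, -1.0)
print(fixed_point_start([0.1, 0.0, 1.5], [-0.05, 0.02, -1.0]))
```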
在一些可能的设计中,确定所述第一定点设备在空中的定点轨迹之前,所述方法还包括:In some possible designs, before determining the fixed-point trajectory of the first pointing device in the air, the method further includes:
接收机械手臂发送的所述第一定点轨迹,其中,所述第一定点轨迹是所述机械手臂控制所述第二定点设备在空中进行定点运动得到的;Receive the first fixed-point trajectory sent by the robotic arm, wherein the first fixed-point trajectory is obtained by the robotic arm controlling the second fixed-pointing device to perform fixed-point movement in the air;
Receive first pose information of the second pointing device, where the first pose information is the pose information of the second pointing device collected while the robotic arm controls the second pointing device to perform fixed-point movement in the air;
通过所述第一位姿信息和所述第一定点轨迹对神经网络进行训练,得到所述映射关系。The neural network is trained through the first posture information and the first fixed point trajectory to obtain the mapping relationship.
上述方案中,通过机械手臂产生定点轨迹比人工产生定点轨迹的准确度高,稳定性好,在训练时使用机械手臂产生定点轨迹可以获得更准确的映射关系。In the above scheme, the fixed-point trajectory generated by the robotic arm is more accurate and stable than the fixed-point trajectory generated manually. Using the robotic arm to generate the fixed-point trajectory during training can obtain a more accurate mapping relationship.
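As an illustration of how such a mapping relationship could be learned, the following is a rough sketch of training a small regression network on the robot-arm data described above. The input and output dimensions, the network size, and the batching of samples are assumptions made for the example, not details taken from this application.

```python
import torch
import torch.nn as nn

# Assumed shapes: each training sample pairs the first pose information
# (visual-sensor pose, 6 values, plus IMU pose, 6 values = 12 inputs) with the
# ground-truth trajectory increment (dx, dy) produced by the robotic arm.
class PoseToTrajectory(nn.Module):
    def __init__(self, in_dim=12, hidden=64, out_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, pose):
        return self.net(pose)

def train_mapping(pose_batches, trajectory_batches, epochs=10, lr=1e-3):
    """pose_batches / trajectory_batches: lists of tensors of shape (B, 12) and (B, 2)."""
    model = PoseToTrajectory()
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for pose, target in zip(pose_batches, trajectory_batches):
            optimiser.zero_grad()
            loss = loss_fn(model(pose), target)   # regression against the robot-arm trajectory
            loss.backward()
            optimiser.step()
    return model
```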
In some possible designs, the first visual sensor includes a stereo camera, a lidar, a depth camera, or a monocular camera. Here, in scenarios with high requirements on fixed-point trajectory accuracy, a lidar, a depth camera, or a stereo camera can be used as the first visual sensor; when the accuracy requirements are lower, a monocular camera can be used to reduce cost. Before a monocular camera is used, however, it needs to be trained so that it is capable of acquiring three-dimensional data.
In some possible designs, the first visual sensor is disposed within the display area, or it may be disposed outside the display area. Here, when the first visual sensor is disposed within the display area, the angle between the optical axis of the first visual sensor and the normal of the display area is zero; in this case, computing the intersection between the display area and the normal vector of a specific part of the user's three-dimensional model or of the three-dimensional model of the first pointing device, both built from data collected by the first visual sensor, is simpler. When the first visual sensor is disposed outside the display area, there is more freedom in where it can be placed; in particular, in certain situations where it is inconvenient to place the first visual sensor within the display area, it can still be placed outside it.
第二方面,提供了一种隔空输入设备,包括:处理器以及显示单元,所述处理器连接所述显示单元,In a second aspect, an air-to-air input device is provided, including: a processor and a display unit, the processor being connected to the display unit,
The processor is configured to determine a fixed-point starting point in the display area generated by the display unit; to determine, while the first pointing device performs three-dimensional fixed-point movement in the air, the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the IMU in the first pointing device, and the mapping relationship between first pose information and a first fixed-point trajectory, where the first pose information includes pose information of a second pointing device collected by a second visual sensor and pose information of the second pointing device collected by a second IMU; and to determine, according to the fixed-point starting point and the fixed-point trajectory, a fixed-point trace in the display area;
所述显示单元用于在所述显示区域中显示所述定点痕迹。The display unit is used to display the fixed point trace in the display area.
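To make the relationship between the fixed-point starting point, the fixed-point trajectory, and the fixed-point trace concrete, the following is a minimal illustrative sketch (not the implementation claimed by this application): the trace shown in the display area is simply the trajectory increments accumulated from the starting point. The pixel-space representation and the scale factor are assumptions made for the example.

```python
def compose_trace(start_xy, trajectory_deltas, scale=1.0):
    """Accumulate in-air trajectory increments onto the fixed-point starting point.

    start_xy: (x, y) starting point on the display, e.g. in pixels.
    trajectory_deltas: iterable of (dx, dy) increments already mapped to the display plane.
    Returns the list of points making up the fixed-point trace to be displayed.
    """
    x, y = start_xy
    trace = [(x, y)]
    for dx, dy in trajectory_deltas:
        x += dx * scale
        y += dy * scale
        trace.append((x, y))
    return trace

# Example: a short stroke starting at the point the user is looking at
print(compose_trace((960, 540), [(2, 0), (2, 1), (1, 3)]))
```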
In some possible designs, the device further includes a receiver configured to receive the pose information of the first pointing device collected by the first visual sensor, where that pose information includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part of the user's three-dimensional model, and where the user's three-dimensional model and the three-dimensional model of the first pointing device are built from the three-dimensional data of the user and of the first pointing device collected by the first visual sensor.
在一些可能的设计中,所述定点起点是所述用户的三维模型或所述第一定点设备的三维模型中的特定部位的法向量与所述显示区域的交点。In some possible designs, the starting point of the fixed point is the intersection point of the normal vector of a specific part in the three-dimensional model of the user or the three-dimensional model of the first pointing device and the display area.
在一些可能的设计中,所述定点起点是所述用户的三维模型中眼睛部位的法向量与所述显示区域的交点。In some possible designs, the starting point of the fixed point is the intersection point of the normal vector of the eye part in the user's three-dimensional model and the display area.
In some possible designs, the receiver is further configured to receive the first fixed-point trajectory sent by a robotic arm, where the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform fixed-point movement in the air; the receiver is further configured to receive the first pose information of the second pointing device, where the first pose information is the pose information of the second pointing device collected while the robotic arm controls the second pointing device to perform fixed-point movement in the air; and the processor is further configured to train a neural network with the first pose information and the first fixed-point trajectory to obtain the mapping relationship.
在一些可能的设计中,所述第一视觉传感器包括立体相机、激光雷达、深度相机或单目相机。In some possible designs, the first visual sensor includes a stereo camera, lidar, depth camera or monocular camera.
第三方面,提供了一种隔空输入系统,包括:In the third aspect, an air-to-air input system is provided, including:
第一定点设备,用于在空中进行三维定点运动,以及,通过所述第一定点设备中的第一IMU采集所述第一定点设备的位姿信息;A first pointing device, used for performing three-dimensional fixed-point movement in the air, and collecting the pose information of the first pointing device through the first IMU in the first pointing device;
第一视觉传感器,用于采集所述第一定点设备的位姿信息;A first visual sensor used to collect position and orientation information of the first pointing device;
An air-to-air input device, configured to determine a fixed-point starting point in the display area generated by the air-to-air input device; to determine, while the first pointing device performs three-dimensional fixed-point movement in the air, the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and the mapping relationship between first pose information and a first fixed-point trajectory, where the first pose information includes pose information of a second pointing device collected by a second visual sensor and pose information of the second pointing device collected by a second IMU; and to determine, according to the fixed-point starting point and the fixed-point trajectory, a fixed-point trace in the display area;
所述隔空输入设备,还用于在所述显示区域中显示所述定点痕迹。The air-to-air input device is also used to display the fixed-point trace in the display area.
Here, the air-to-air input device may be a smart screen, or a projector with a fixed-point trace determination function, and so on. When the air-to-air input device is a smart screen, the display area generated by the air-to-air input device refers to the display area of the smart screen; when the air-to-air input device is a projector with a fixed-point trace determination function, the display area generated by the air-to-air input device refers to the projection area generated by the projector.
In a possible implementation, the fixed-point trace determination function is implemented by a peripheral fixed-point trace determination device rather than being integrated into the projector. The fixed-point trace determination function includes receiving the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, and determining the fixed-point trace according to those two sets of pose information. The projector can then be used to display the fixed-point trace in the projection area.
In some possible designs, the pose information of the first pointing device collected by the first visual sensor includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part of the user's three-dimensional model, where the user's three-dimensional model and the three-dimensional model of the first pointing device are built from the three-dimensional data of the user and of the first pointing device collected by the first visual sensor.
在一些可能的设计中,所述定点起点是所述用户的三维模型或所述第一定点设备的三维模型中的特定部位的法向量与所述显示区域的交点。In some possible designs, the starting point of the fixed point is the intersection point of the normal vector of a specific part in the three-dimensional model of the user or the three-dimensional model of the first pointing device and the display area.
在一些可能的设计中,所述定点起点是所述用户的三维模型中眼睛部位的法向量与所述显示区域的交点。In some possible designs, the starting point of the fixed point is the intersection point of the normal vector of the eye part in the user's three-dimensional model and the display area.
In some possible designs, the air-to-air input device is further configured to receive the first fixed-point trajectory sent by a robotic arm, where the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform three-dimensional fixed-point movement in the air;
The air-to-air input device is further configured to receive the first pose information of the second pointing device, where the first pose information is the pose information of the second pointing device collected while the robotic arm controls the second pointing device to perform fixed-point movement in the air;
所述隔空输入设备,还用于通过所述第一位姿信息和所述第一定点轨迹对神经网络进行训练,得到所述映射关系。The air-to-air input device is also used to train a neural network through the first attitude information and the first fixed point trajectory to obtain the mapping relationship.
在一些可能的设计中,所述第一视觉传感器包括立体相机、激光雷达、深度相机或单目相机。In some possible designs, the first visual sensor includes a stereo camera, lidar, depth camera or monocular camera.
第四方面,提供了一种隔空输入设备,包括:The fourth aspect provides an air-to-air input device, including:
起点确定单元,用于在显示区域中确定定点起点;The starting point determination unit is used to determine the fixed point starting point in the display area;
A trajectory determination unit, configured to determine, while the first pointing device performs three-dimensional fixed-point movement in the air, the fixed-point trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and the mapping relationship between first pose information and a first fixed-point trajectory, where the first pose information includes pose information of a second pointing device collected by a second visual sensor and pose information of the second pointing device collected by a second IMU;
痕迹确定单元,用于根据所述定点起点以及所述定点轨迹,确定在所述显示区域的定点痕迹。A trace determination unit is configured to determine fixed-point traces in the display area according to the fixed-point starting point and the fixed-point trajectory.
In some possible designs, the pose information of the first pointing device collected by the first visual sensor includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part of the user's three-dimensional model, where the user's three-dimensional model and the three-dimensional model of the first pointing device are built from the three-dimensional data of the user and of the first pointing device collected by the first visual sensor.
在一些可能的设计中,所述定点起点是所述用户的三维模型或所述第一定点设备的三维模型中的特定部位的法向量与所述显示区域的交点。In some possible designs, the starting point of the fixed point is the intersection point of the normal vector of a specific part in the three-dimensional model of the user or the three-dimensional model of the first pointing device and the display area.
在一些可能的设计中,所述定点起点是所述用户的三维模型中眼睛部位的法向量与所述显示区域的交点。In some possible designs, the starting point of the fixed point is the intersection point of the normal vector of the eye part in the user's three-dimensional model and the display area.
In some possible designs, the device further includes a training unit configured to receive the first fixed-point trajectory sent by a robotic arm, where the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform three-dimensional fixed-point movement in the air; and to receive the first pose information and train a neural network with the first pose information and the first fixed-point trajectory to obtain the mapping relationship.
在一些可能的设计中,所述第一视觉传感器包括立体相机、激光雷达、深度相机或单目相机。In some possible designs, the first visual sensor includes a stereo camera, lidar, depth camera or monocular camera.
In a fifth aspect, a computer-readable storage medium is provided, which includes instructions that, when run on an air-to-air input device, cause the air-to-air input device to perform the method according to any one of the designs of the first aspect.
附图说明Description of the drawings
为了更清楚地说明本申请实施例或背景技术中的技术方案,下面将对本申请实施例或背景技术中所需要使用的附图进行说明。In order to more clearly explain the technical solutions in the embodiments of the present application or the background technology, the drawings required to be used in the embodiments or the background technology of the present application will be described below.
图1是本申请涉及的一种隔空输入场景的示意图;Figure 1 is a schematic diagram of an air-to-air input scenario involved in this application;
图2是本申请涉及的一种隔空输入系统的结构示意图;Figure 2 is a schematic structural diagram of an air-to-air input system involved in this application;
图3是本申请提供的一种智慧屏的结构示意图;Figure 3 is a schematic structural diagram of a smart screen provided by this application;
图4是本申请提供的一种定点设备的结构示意图;Figure 4 is a schematic structural diagram of a pointing device provided by this application;
图5是本申请提供的一种IMU的结构示意图;Figure 5 is a schematic structural diagram of an IMU provided by this application;
图6是本申请提供的一种隔空输入系统的结构示意图;Figure 6 is a schematic structural diagram of an air-to-air input system provided by this application;
图7是本申请提供的一种投影设备的结构示意图;Figure 7 is a schematic structural diagram of a projection device provided by this application;
图8是本申请提供的一种隔空输入方法的流程示意图;Figure 8 is a schematic flow chart of an air-to-air input method provided by this application;
图9是本申请提供的IMU采集的定点设备的位姿信息的示意图;Figure 9 is a schematic diagram of the pose information of the fixed-point device collected by the IMU provided by this application;
图10是本申请提供的一种根据用户的三维模型的眼睛部分确定定点起点的示意图;Figure 10 is a schematic diagram of determining the starting point of a fixed point based on the eye part of the user's three-dimensional model provided by this application;
图11是本申请提供的一种轨迹预测模型的示意图;Figure 11 is a schematic diagram of a trajectory prediction model provided by this application;
图12示出了本申请提供的一种隔空输入设备的结构示意图;Figure 12 shows a schematic structural diagram of an air-to-air input device provided by this application;
图13示出了本申请提供的一种隔空输入设备的结构示意图。Figure 13 shows a schematic structural diagram of an air-to-air input device provided by this application.
具体实施方式 Detailed Description of Embodiments
Referring to Figure 2, Figure 2 is a schematic structural diagram of an air-to-air input system provided by this application. As shown in Figure 2, the air-to-air input system provided by this application includes: a smart screen 110, a pointing device 120, and a visual sensor 130.
As shown in Figure 3, the smart screen 110 may include: a processor 112, a memory 113, a wireless communication module 114, a power switch 115, a wired local area network (LAN) communication module 116, a high definition multimedia interface (HDMI) communication module 117, a universal serial bus (USB) communication module 118, and a display 119, where:
处理器112可用于读取和执行计算机可读指令。具体实现中,处理器112可主要包括控制器、运算器和寄存器。其中,控制器主要负责指令译码,并为指令对应的操作发出控制信号。运算器主要负责执行定点或浮点算数运算操作、移位操作以及逻辑操作等,也可以执行地址运算和转换。寄存器主要负责保存指令执行过程中临时存放的寄存器操作数和中间操作结果等。Processor 112 may be used to read and execute computer-readable instructions. In specific implementation, the processor 112 may mainly include a controller, arithmetic unit, and a register. Among them, the controller is mainly responsible for decoding instructions and issuing control signals for operations corresponding to the instructions. The arithmetic unit is mainly responsible for performing fixed-point or floating-point arithmetic operations, shift operations, and logical operations. It can also perform address operations and conversions. Registers are mainly responsible for storing register operands and intermediate operation results temporarily stored during instruction execution.
在一些实施例中,处理器112可以用于解析无线通信模块114和/或有线LAN通信模块116接收到的信号。处理器112可以用于根据解析结果进行相应的处理操作,如生成探测响应,又如根据该显示请求或显示指令驱动显示器119执行显示,等等。In some embodiments, processor 112 may be used to parse signals received by wireless communication module 114 and/or wired LAN communication module 116 . The processor 112 may be configured to perform corresponding processing operations according to the parsing results, such as generating a detection response, driving the display 119 to perform display according to the display request or display instruction, and so on.
In some embodiments, the processor 112 may also be used to generate the signals sent out by the wireless communication module 114 and/or the wired LAN communication module 116, such as Bluetooth broadcast signals and beacon signals, or signals sent to an electronic device to feed back the display status (such as display success, display failure, and so on).
存储器113与处理器112耦合,用于存储各种软件程序和/或多组指令。具体实现中,存储器113可包括高速随机存取的存储器,并且也可包括非易失性存储器,例如一个或多个磁盘存储设备、闪存设备或其他非易失性固态存储设备。存储器113可以存储操作系统,例如uCOS、VxWorks、RTLinux等嵌入式操作系统。存储器113还可以存储通信程序,该通信程序可用于与一个或多个服务器,或附加设备进行通信。Memory 113 is coupled to processor 112 for storing various software programs and/or sets of instructions. In specific implementations, the memory 113 may include high-speed random access memory, and may also include non-volatile memory, such as one or more disk storage devices, flash memory devices or other non-volatile solid-state storage devices. The memory 113 can store operating systems, such as uCOS, VxWorks, RTLinux and other embedded operating systems. Memory 113 may also store communications programs that may be used to communicate with one or more servers, or additional devices.
无线通信模块114可以包括蓝牙(bluetooth)通信模块114A、无线局域网(wireless local area networks,WLAN)通信模块114B中的一项或多项。The wireless communication module 114 may include one or more of a Bluetooth (bluetooth) communication module 114A and a wireless local area networks (WLAN) communication module 114B.
In some embodiments, one or more of the Bluetooth (BT) communication module 114A and the WLAN communication module 114B can listen for signals transmitted by other devices, such as probe requests and scanning signals, and can send response signals, such as probe responses and scanning responses, so that other devices can discover the smart screen 110, establish a wireless communication connection with it, and communicate with it through one or more wireless communication technologies such as Bluetooth or WLAN.
In other embodiments, one or more of the Bluetooth communication module 114A and the WLAN communication module 114B can also transmit signals, such as broadcast Bluetooth signals and beacon signals, so that other devices can discover the smart screen 110, establish a wireless communication connection with it, and communicate with it through one or more wireless communication technologies such as Bluetooth or WLAN.
无线通信模块114还可以包括:蓝牙、WLAN、近场通信(near field communication,NFC)、超宽带(ultra wide band,UWB)、红外等等。The wireless communication module 114 may also include: Bluetooth, WLAN, near field communication (NFC), ultra wide band (UWB), infrared, and so on.
电源开关115可用于控制电源向智慧屏110的供电。The power switch 115 can be used to control the power supply to the smart screen 110 .
有线LAN通信模块116可用于通过有线LAN和同一个LAN中的其他设备进行通信,还可用于通过有线LAN连接到广域网,可与广域网中的设备通信。The wired LAN communication module 116 can be used to communicate with other devices in the same LAN through the wired LAN, and can also be used to connect to the wide area network through the wired LAN and communicate with devices in the wide area network.
HDMI通信模块117可用于通过HDMI接口(未示出)与其他设备进行通信。HDMI communication module 117 may be used to communicate with other devices through an HDMI interface (not shown).
USB通信模块118可用于通过USB接口(未示出)与其他设备进行通信。USB communication module 118 may be used to communicate with other devices through a USB interface (not shown).
The display 119 may be used to display images, videos, and so on. The display 119 may be a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, an active-matrix organic light-emitting diode (AMOLED) display, a flexible light-emitting diode (FLED) display, a quantum dot light-emitting diode (QLED) display, and so on.
在一些实施例中,智慧屏110还可以包括音频模块(未示出)。音频模块可用于通过音频输出接口输出音频信号,这样可使得智慧屏110支持音频播放。音频模块还可用于通过音频输入接口接收音频数据。智慧屏110可以为电视机等媒体播放设备。In some embodiments, the smart screen 110 may also include an audio module (not shown). The audio module can be used to output audio signals through the audio output interface, so that the smart screen 110 supports audio playback. The audio module can also be used to receive audio data through the audio input interface. The smart screen 110 can be a media playback device such as a television.
在一些实施例中,智慧屏110还可以包括RS-232接口等串行接口。该串行接口可连接至其他设备,如音箱等音频外放设备,使得智慧屏110和音频外放设备协作播放音视频。In some embodiments, the smart screen 110 may also include a serial interface such as an RS-232 interface. The serial interface can be connected to other devices, such as speakers and other audio external amplifiers, so that the smart screen 110 and the audio external amplifiers can cooperate to play audio and video.
可以理解的是图3示意的结构并不构成对智慧屏110的具体限定。在本申请另一些实施例中,智慧屏110可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。It can be understood that the structure illustrated in Figure 3 does not constitute a specific limitation on the smart screen 110. In other embodiments of the present application, the smart screen 110 may include more or fewer components than shown in the figure, or combine some components, or split some components, or arrange different components. The components illustrated may be implemented in hardware, software, or a combination of software and hardware.
如图4所示,定点设备120可以是设置有IMU的绘制设备。定点设备可以是笔状的设备,也可以是其他形状的设备。定点设备可以是书写笔、控制器等等。定点设备120可包括:处理器122、存储器123、无线通信模块124、充电管理模块125、USB接口126、电池127、电源管理模块128以及IMU 129。As shown in FIG. 4 , the pointing device 120 may be a drawing device provided with an IMU. The pointing device can be a pen-shaped device or a device of other shapes. Pointing devices can be pens, controllers, etc. The pointing device 120 may include: a processor 122, a memory 123, a wireless communication module 124, a charging management module 125, a USB interface 126, a battery 127, a power management module 128 and an IMU 129.
处理器122可用于读取和执行计算机可读指令。具体实现中,处理器122可主要包括控制器、运算器和寄存器。其中,控制器主要负责指令译码,并为指令对应的操作发出控制信号。运算器主要负责执行定点或浮点算数运算操作、移位操作以及逻辑操作等,也可以执行地址运算和转换。寄存器主要负责保存指令执行过程中临时存放的寄存器操作数和中间操作结果等。此外,处理器122也可以采用异构架构,例如,ARM+DSP的架构,ARM+ASIC的架构,ARM+AI芯片的架构等等。Processor 122 may be used to read and execute computer-readable instructions. In specific implementation, the processor 122 may mainly include a controller, arithmetic unit, and a register. Among them, the controller is mainly responsible for decoding instructions and issuing control signals for operations corresponding to the instructions. The arithmetic unit is mainly responsible for performing fixed-point or floating-point arithmetic operations, shift operations, and logical operations. It can also perform address operations and conversions. Registers are mainly responsible for storing register operands and intermediate operation results temporarily stored during instruction execution. In addition, the processor 122 may also adopt a heterogeneous architecture, such as an ARM+DSP architecture, an ARM+ASIC architecture, an ARM+AI chip architecture, and so on.
存储器123与处理器122耦合,用于存储各种软件程序和/或多组指令。具体实现中,存储器123可包括高速随机存取的存储器,并且也可包括非易失性存储器,例如一个或多个磁盘存储设备、闪存设备或其他非易失性固态存储设备。存储器123可以存储操作系统,例如uCOS、VxWorks、RTLinux等嵌入式操作系统。Memory 123 is coupled to processor 122 for storing various software programs and/or sets of instructions. In specific implementations, the memory 123 may include high-speed random access memory, and may also include non-volatile memory, such as one or more disk storage devices, flash memory devices or other non-volatile solid-state storage devices. The memory 123 can store operating systems, such as uCOS, VxWorks, RTLinux and other embedded operating systems.
无线通信模块124可以包括蓝牙(BT)通信模块、WLAN通信模块、近场通信(near field communication,NFC)、超宽带(ultra wide band,UWB)、红外等等中的一项或多项。The wireless communication module 124 may include one or more of a Bluetooth (BT) communication module, a WLAN communication module, near field communication (NFC), ultra wide band (UWB), infrared, and the like.
充电管理模块125用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块125可以通过USB接口126接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块125可以通过无线充电线圈接收无线充电输入。充电管理模块125为电池127充电的同时,还可以通过电源管理模块128为定点设备供电。The charge management module 125 is used to receive charging input from the charger. Among them, the charger can be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 125 may receive charging input from the wired charger through the USB interface 126 . In some wireless charging embodiments, the charging management module 125 may receive wireless charging input through a wireless charging coil. While the charging management module 125 charges the battery 127, it can also provide power to the pointing device through the power management module 128.
电源管理模块128用于连接电池127,充电管理模块125与处理器122。电源管理模块128接收电池127和/或充电管理模块125的输入,为处理器122,存储器123和无线通信模块124,IMU 129等供电。电源管理模块128还可以用于监测电池容量,电池循环次数,电池健康状态(漏电,阻抗)等参数。在其他一些实施例中,电源管理模块128也可以设置于处理器122中。在另一些实施例中,电源管理模块128和充电管理模块125也可以设置于同一个器件中。 The power management module 128 is used to connect the battery 127, the charging management module 125 and the processor 122. The power management module 128 receives input from the battery 127 and/or the charging management module 125 and supplies power to the processor 122, the memory 123, the wireless communication module 124, the IMU 129, and the like. The power management module 128 can also be used to monitor battery capacity, battery cycle times, battery health status (leakage, impedance) and other parameters. In some other embodiments, the power management module 128 may also be provided in the processor 122 . In other embodiments, the power management module 128 and the charging management module 125 can also be provided in the same device.
IMU 129 is a device that measures the three-axis angular velocity and acceleration of an object. An IMU may contain a three-axis gyroscope and a three-axis accelerometer to measure the angular velocity and acceleration of an object in three-dimensional space. Strictly speaking, an IMU only provides the user with three-axis angular velocity and three-axis acceleration data. A vertical reference unit (VRU) builds on the IMU by taking the gravity vector as a reference and using algorithms such as Kalman filtering or complementary filtering to provide the user with a pitch angle and a roll angle referenced to the gravity vector, together with a heading angle that has no absolute reference. The commonly mentioned 6-axis attitude module belongs to this type of system. The heading angle has no reference: regardless of where the module faces, the heading angle is 0° (or a set constant) after startup, and it slowly accumulates error as the module keeps working. Because the pitch angle and roll angle are referenced to the gravity vector, they do not accumulate error over long periods under low-maneuver conditions. An attitude and heading reference system (AHRS) adds a magnetometer or an optical flow sensor on top of the VRU and uses algorithms such as Kalman filtering or complementary filtering to provide the user with pitch, roll, and heading angles that all have an absolute reference; such systems are used to provide accurate and reliable attitude and navigation information for aircraft. The commonly mentioned 9-axis attitude sensor belongs to this type of system: because the heading angle is referenced to the geomagnetic field, it does not drift. However, the geomagnetic field is very weak and is often disturbed by nearby magnetic objects, so resisting various magnetic disturbances under high maneuverability has become a hot topic in AHRS research. As shown in Figure 5, the accelerometer, gyroscope, magnetometer, and so on serve as the basic inputs, and the attitude information, position information, and velocity information required by the user are output through the data acquisition unit, the calibration and compensation unit, the data fusion unit, and the output and configuration unit.
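As an illustration of the data fusion mentioned above, the following is a minimal complementary-filter sketch for a 6-axis module (gyroscope plus accelerometer). The axis convention, sampling period, and blending coefficient are assumed values; a practical VRU or AHRS would typically use a more elaborate filter such as a Kalman filter.

```python
import math

def complementary_filter(acc, gyro_xy, pitch, roll, dt=0.01, alpha=0.98):
    """One fusion step for a 6-axis attitude estimate.

    acc: (ax, ay, az) accelerometer reading, dominated by gravity.
    gyro_xy: (gx, gy) angular rates about the x-axis (roll rate) and y-axis (pitch rate), rad/s.
    pitch, roll: previous angle estimates in radians.

    The gyroscope integral tracks fast motion; the accelerometer-derived,
    gravity-referenced angles correct the slow drift, weighted by (1 - alpha).
    """
    ax, ay, az = acc
    gx, gy = gyro_xy
    pitch_acc = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    roll_acc = math.atan2(ay, az)
    pitch = alpha * (pitch + gy * dt) + (1 - alpha) * pitch_acc
    roll = alpha * (roll + gx * dt) + (1 - alpha) * roll_acc
    return pitch, roll

# Example: one update with the device roughly level and slowly rotating
print(complementary_filter((0.0, 0.0, 9.8), (0.01, -0.02), 0.0, 0.0))
```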
可以理解的是图4示意的结构并不构成对定点设备120的具体限定。在本申请另一些实施例中,定点设备120可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。It can be understood that the structure illustrated in FIG. 4 does not constitute a specific limitation on the pointing device 120 . In other embodiments of the present application, the pointing device 120 may include more or fewer components than shown in the figures, or some components may be combined, or some components may be separated, or may be arranged differently. The components illustrated may be implemented in hardware, software, or a combination of software and hardware.
视觉传感器130可以是经过训练具有获取三维数据能力的单目摄像机、立体相机、深度相机或激光雷达等等,此处不作具体限定。在一具体的实施例中,视觉传感器可以和隔空输入设备是集成设置的,也可以是分别单独设置的,此处不作具体限定。The visual sensor 130 may be a monocular camera, a stereo camera, a depth camera or a laser radar that has been trained to acquire three-dimensional data, and is not specifically limited here. In a specific embodiment, the visual sensor may be integrated with the air-to-air input device, or may be provided separately, which is not specifically limited here.
可以理解的是图2示意的结构并不构成对隔空输入系统的具体限定。在本申请另一些实施例中,隔空输入系统可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。It can be understood that the structure illustrated in Figure 2 does not constitute a specific limitation on the air-to-air input system. In other embodiments of the present application, the air-to-air input system may include more or less components than shown in the figures, or some components may be combined, or some components may be separated, or may be arranged differently. The components illustrated may be implemented in hardware, software, or a combination of software and hardware.
参见图6,图6是本申请提供的一种隔空输入系统的结构示意图。如图6所示,本申请提供的隔空输入系统,包括:带有定点痕迹确定功能的投影仪210、定点设备220以及视觉传感器230。Referring to Figure 6, Figure 6 is a schematic structural diagram of an air-to-air input system provided by the present application. As shown in Figure 6, the air-to-air input system provided by this application includes: a projector 210 with a function of determining fixed-point traces, a pointing device 220, and a visual sensor 230.
As shown in Figure 7, the projector 210 with the fixed-point trace determination function may include: a processor 212, a memory 213, a wireless communication module 214, a power switch 215, a wired LAN communication module 216, an HDMI communication module 217, a light source controller 218, and an image projector 219, where:
处理器212可用于读取和执行计算机可读指令。具体实现中,处理器212可主要包括控制器、运算器和寄存器。其中,控制器主要负责指令译码,并为指令对应的操作发出控制信号。运算器主要负责执行定点或浮点算数运算操作、移位操作以及逻辑操作等,也可以执行地址运算和转换。寄存器主要负责保存指令执行过程中临时存放的寄存器操作数和中间操作结果等。Processor 212 may be used to read and execute computer-readable instructions. In specific implementation, the processor 212 may mainly include a controller, arithmetic unit, and a register. Among them, the controller is mainly responsible for decoding instructions and issuing control signals for operations corresponding to the instructions. The arithmetic unit is mainly responsible for performing fixed-point or floating-point arithmetic operations, shift operations, and logical operations. It can also perform address operations and conversions. Registers are mainly responsible for storing register operands and intermediate operation results temporarily stored during instruction execution.
在一些实施例中,处理器212可以用于解析无线通信模块214和/有线LAN通信模块216接收到的信号,如广播的探测请求,投影请求,云投影服务提供商的服务器发送的投影指令,等等。处理器212可以用于根据解析结果进行相应的处理操作,如生成探测响应,又如根据 该投影请求或投影指令驱动光源控制器218和图像投影器执行投影操作,等等。In some embodiments, the processor 212 may be used to parse signals received by the wireless communication module 214 and/or the wired LAN communication module 216, such as broadcast detection requests, projection requests, projection instructions sent by the server of the cloud projection service provider, etc. The processor 212 may be configured to perform corresponding processing operations according to the parsing results, such as generating a detection response, or according to The projection request or projection instruction drives the light source controller 218 and the image projector to perform projection operations, and so on.
In some embodiments, the processor 212 may also be used to generate the signals sent out by the wireless communication module 214 and/or the wired LAN communication module 216, such as Bluetooth broadcast signals and beacon signals, or signals sent to an electronic device to feed back the projection status (such as projection success, projection failure, and so on).
存储器213与处理器212耦合,用于存储各种软件程序和/或多组指令。具体实现中,存储器213可包括高速随机存取的存储器,并且也可包括非易失性存储器,例如一个或多个磁盘存储设备、闪存设备或其他非易失性固态存储设备。存储器213可以存储操作系统,例如uCOS、VxWorks、RTLinux等嵌入式操作系统。存储器213还可以存储通信程序,该通信程序可用于与一个或多个服务器,或附加设备进行通信。Memory 213 is coupled to processor 212 for storing various software programs and/or sets of instructions. In specific implementations, the memory 213 may include high-speed random access memory, and may also include non-volatile memory, such as one or more disk storage devices, flash memory devices or other non-volatile solid-state storage devices. The memory 213 can store operating systems, such as uCOS, VxWorks, RTLinux and other embedded operating systems. Memory 213 may also store communications programs that may be used to communicate with one or more servers, or additional devices.
无线通信模块214可以包括蓝牙(BT)通信模块214A、WLAN通信模块214B中的一项或多项。The wireless communication module 214 may include one or more of a Bluetooth (BT) communication module 214A and a WLAN communication module 214B.
在一些实施例中,蓝牙(BT)通信模块、WLAN通信模块中的一项或多项可以监听到其他设备发射的信号,如探测请求、扫描信号等等,并可以发送响应信号,如探测响应、扫描响应等,使得其他设备可以发现投影仪210,并与其他设备建立无线通信连接,通过蓝牙或WLAN中的一种或多种无线通信技术与其他设备进行通信。In some embodiments, one or more of the Bluetooth (BT) communication module and the WLAN communication module can monitor signals transmitted by other devices, such as detection requests, scanning signals, etc., and can send response signals, such as detection responses. , scanning response, etc., so that other devices can discover the projector 210, establish wireless communication connections with other devices, and communicate with other devices through one or more wireless communication technologies in Bluetooth or WLAN.
In other embodiments, one or more of the Bluetooth (BT) communication module and the WLAN communication module can also transmit signals, such as broadcast Bluetooth signals and beacon signals, so that other devices can discover the projector 210, establish a wireless communication connection with it, and communicate with it through one or more wireless communication technologies such as Bluetooth or WLAN.
无线通信模块214还可以包括蜂窝移动通信模块(未示出)。蜂窝移动通信模块可以通过蜂窝移动通信技术与其他设备(如服务器)进行通信。Wireless communication module 214 may also include a cellular mobile communication module (not shown). The cellular mobile communication module can communicate with other devices (such as servers) through cellular mobile communication technology.
电源开关215可用于控制电源向投影仪210的供电。The power switch 215 may be used to control power supply to the projector 210 .
有线LAN通信模块216可用于通过有线LAN和同一个LAN中的其他设备进行通信,还可用于通过有线LAN连接到广域网,可与广域网中的设备通信。The wired LAN communication module 216 can be used to communicate with other devices in the same LAN through the wired LAN, and can also be used to connect to the wide area network through the wired LAN and communicate with devices in the wide area network.
HDMI通信模块217可用于通过HDMI接口(未示出)与其他设备进行通信。HDMI communication module 217 may be used to communicate with other devices through an HDMI interface (not shown).
图像投影器219可具有光源(未示出),可根据图像数据对从光源处射出的光进行调制并在屏幕上投影图像。The image projector 219 may have a light source (not shown), may modulate light emitted from the light source according to image data and project an image on the screen.
光源控制器218可用于控制图像投影器219具有的光源的点亮。The light source controller 218 may be used to control lighting of the light source provided by the image projector 219 .
可以理解的是图6示意的结构并不构成对带有定点痕迹确定功能的投影仪210的具体限定。在本申请另一些实施例中,带有定点痕迹确定功能的投影仪210可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。It can be understood that the structure illustrated in FIG. 6 does not constitute a specific limitation on the projector 210 with the function of determining fixed point traces. In other embodiments of the present application, the projector 210 with fixed point trace determination function may include more or less components than shown in the figure, or combine some components, or split some components, or arrange different components. . The components illustrated may be implemented in hardware, software, or a combination of software and hardware.
定点设备220以及视觉传感器230可以参见图2所示的定点设备120以及视觉传感器130。The pointing device 220 and the visual sensor 230 may refer to the pointing device 120 and the visual sensor 130 shown in FIG. 2 .
In addition, the fixed-point trace determination function can be implemented by a peripheral fixed-point trace determination device rather than being integrated into the projector. The fixed-point trace determination function includes receiving the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, and determining the fixed-point trace according to those two sets of pose information. The projector can then be used to display the fixed-point trace in the projection area.
Referring to Figure 8, Figure 8 is a schematic flowchart of an air-to-air input method provided by this application. As shown in Figure 8, the air-to-air input method provided by this application includes:
S101:第一定点设备在空中进行三维定点运动,并通过第一定点设备中的第一IMU采集所述第一定点设备的位姿信息。S101: The first pointing device performs three-dimensional pointing movement in the air, and collects the pose information of the first pointing device through the first IMU in the first pointing device.
在一具体的实施例中,第一定点设备可以是图2或者图6所示的隔空输入系统中的定点设备。第一定点设备的具体构成请参见图2或者图6以及相关描述,此处不再展开赘述。In a specific embodiment, the first pointing device may be a pointing device in the air-to-air input system shown in FIG. 2 or FIG. 6 . Please refer to Figure 2 or Figure 6 and related descriptions for the specific structure of the first pointing device, and will not be described in detail here.
In a specific embodiment, the pose information of the first pointing device collected by the first IMU includes: the three-dimensional coordinates of the first pointing device and the angles of the first pointing device. With reference to Figure 5, the first pointing device takes measurements with the gyroscope, accelerometer, and magnetometer in the first IMU, and passes the measured data in turn through the data acquisition unit, the calibration and compensation unit, the data fusion unit, and the output and configuration unit for acquisition, calibration and compensation, data fusion, and output configuration, thereby obtaining the three-dimensional coordinates of the first pointing device (the position information in Figure 5) and the angles of the first pointing device (the attitude information in Figure 5).
The following description uses the specific embodiment in Figure 9. The pose information of the first pointing device collected by the first IMU is (x_pen, y_pen, z_pen, yaw, roll, pitch), where x_pen is the coordinate of the first pointing device on the x-axis relative to the starting moment, y_pen is its coordinate on the y-axis relative to the starting moment, z_pen is its coordinate on the z-axis relative to the starting moment, yaw is the pitch angle of the first pointing device relative to the starting moment, roll is its roll angle relative to the starting moment, and pitch is its heading angle relative to the starting moment. The raw data collected by the first IMU are the accelerations along the x, y, and z axes and the angular velocities about the x, y, and z axes: x_pen is obtained by integrating the x-axis acceleration twice, y_pen by integrating the y-axis acceleration twice, and z_pen by integrating the z-axis acceleration twice; yaw is obtained by integrating the x-axis angular velocity once, roll by integrating the y-axis angular velocity once, and pitch by integrating the z-axis angular velocity once. Because the accelerations and angular velocities collected by the first IMU all contain systematic and random errors, and the integration process amplifies these errors, the resulting pose information has a relatively large error.
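The integration described above can be written out directly. The following sketch shows the naive dead-reckoning that produces (x_pen, y_pen, z_pen, yaw, roll, pitch) from raw IMU samples and illustrates why the error grows: any constant bias enters the angles linearly in time and the positions quadratically in time. It assumes gravity has already been removed from the accelerations and follows the axis-to-angle correspondence stated above.

```python
def dead_reckon(samples, dt):
    """Naively integrate IMU samples into a pose relative to the starting moment.

    samples: iterable of (ax, ay, az, wx, wy, wz) with accelerations in m/s^2
    (gravity assumed removed) and angular rates in rad/s, in the device frame.
    Returns (x_pen, y_pen, z_pen, yaw, roll, pitch). A constant sensor bias b
    contributes b*t to the angles and 0.5*b*t^2 to the positions, which is why
    the result drifts as time goes on.
    """
    vel = [0.0, 0.0, 0.0]
    pos = [0.0, 0.0, 0.0]
    ang = [0.0, 0.0, 0.0]
    for ax, ay, az, wx, wy, wz in samples:
        for i, a in enumerate((ax, ay, az)):
            vel[i] += a * dt          # first integration: acceleration -> velocity
            pos[i] += vel[i] * dt     # second integration: velocity -> position
        for i, w in enumerate((wx, wy, wz)):
            ang[i] += w * dt          # single integration: angular rate -> angle
    x_pen, y_pen, z_pen = pos
    yaw, roll, pitch = ang
    return x_pen, y_pen, z_pen, yaw, roll, pitch

# Example: 100 samples at 100 Hz with a small constant bias already shows drift
print(dead_reckon([(0.05, 0.0, 0.0, 0.01, 0.0, 0.0)] * 100, dt=0.01))
```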
S102:第一视觉传感器采集所述第一定点设备的位姿信息。S102: The first visual sensor collects the pose information of the first pointing device.
在一具体的实施例中,第一视觉传感器可以是图2或者图6所示的隔空输入系统中的视觉传感器。第一视觉传感器的具体构成请参见图2或者图6以及相关描述,此处不再展开赘述。In a specific embodiment, the first visual sensor may be the visual sensor in the air-to-air input system shown in FIG. 2 or FIG. 6 . Please refer to Figure 2 or Figure 6 and related descriptions for the specific structure of the first visual sensor, which will not be described again here.
在一具体的实施例中,所述第一视觉传感器可以设置于所述显示区域中,也可以设置在所述显示区域之外。这里,当第一视觉传感器设置在所述显示区域中时,第一视觉传感器的视线轴心线和显示区域的法线的夹角为零,此时,可以简化后续计算的复杂度。当第一视觉传感器设置在显示区域之外时,可以增加第一视觉传感器设置的自由度,尤其是某些特定的场合中,第一视角传感器不方便设置在显示区域中时,还可以设置在显示区域之外。In a specific embodiment, the first visual sensor may be disposed in the display area or may be disposed outside the display area. Here, when the first visual sensor is disposed in the display area, the angle between the line of sight axis of the first visual sensor and the normal line of the display area is zero. In this case, the complexity of subsequent calculations can be simplified. When the first visual sensor is set outside the display area, the degree of freedom in setting the first visual sensor can be increased. Especially in some specific occasions, when it is inconvenient to set the first viewing angle sensor in the display area, it can also be set in the display area. outside the display area.
In a specific embodiment, the first visual sensor collects the three-dimensional data of the user and the three-dimensional data of the first pointing device. The data collected at this point take the viewpoint of the first visual sensor as the coordinate origin, so a coordinate transformation is needed to convert the user's three-dimensional data and the first pointing device's three-dimensional data from a coordinate system whose origin is the optical center of the first visual sensor into a coordinate system whose origin is the center of the display area. The user's three-dimensional model is built from the transformed three-dimensional data of the user, and the three-dimensional model of the first pointing device is built from the transformed three-dimensional data of the first pointing device. Then, the pose information of the first pointing device is determined from the user's three-dimensional model and the three-dimensional model of the first pointing device. The pose information of the first pointing device collected by the first visual sensor includes the three-dimensional coordinates of a point in the three-dimensional model of the first pointing device, and the rotation angle of the three-dimensional model of the first pointing device relative to the wrist part of the user's three-dimensional model.
When the first visual sensor is disposed in the smart screen, that is, when the first visual sensor is integrated into the smart screen, the positional relationship between the first visual sensor and the center of the display area of the smart screen does not change. Therefore, calibration can be performed in advance, before the device leaves the factory, to determine the coordinate transformation between the optical center of the first visual sensor and the center of the display area; after leaving the factory, this coordinate transformation does not change.
When the first visual sensor is disposed outside the smart screen, that is, when the first visual sensor is an external device of the smart screen, one or several mounting positions can be provided on the outer frame of the smart screen. The positional relationship between these mounting positions and the center of the display area of the smart screen does not change, so calibration can be performed in advance, before the device leaves the factory, to determine the coordinate transformation between a first visual sensor placed at one of these mounting positions and the center of the display area; after leaving the factory, this coordinate transformation does not change. Alternatively, the first visual sensor can be placed at any position from which the display area of the smart screen can be captured; a calibration image is then shown in the display area of the smart screen, and calibration is performed based on the calibration image to determine the coordinate transformation between the optical center of the first visual sensor and the center of the display area.
When the first visual sensor is disposed outside the projector with the fixed-point trace determination function, one or several mounting positions can be defined with respect to the projection boundary; for example, the first visual sensor can be placed at the upper-left corner or the upper-right corner of the projection boundary. The relative positional relationship between these mounting positions and the center of the projector's display area is well defined, so the coordinate transformation between a first visual sensor placed at one of these mounting positions and the center of the display area can be determined. Alternatively, the first visual sensor can be placed at any position from which the projector's display area can be captured; a calibration image is then shown in the projector's display area, and calibration is performed based on the calibration image to determine the coordinate transformation between the optical center of the first visual sensor and the center of the display area. It can be understood that when the projector with the fixed-point trace determination function is replaced by a combination of a fixed-point trace determination device and a projector, the calibration method for the coordinate transformation is similar and is not described again here.
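As an illustration of the calibration-image approach, the following is a rough sketch using OpenCV's standard chessboard routines to estimate the pose of a pattern shown in the display area relative to the camera; the board size, square size, and camera intrinsics are assumptions, and this is not the calibration procedure claimed by this application.

```python
import cv2
import numpy as np

def calibrate_display_pose(image, camera_matrix, dist_coeffs,
                           board_size=(9, 6), square_size_m=0.05):
    """Estimate the pose of a chessboard calibration image shown in the display area.

    Returns (R, t) such that p_camera = R @ p_board + t for points of the displayed
    board (board coordinates lie in the display plane with z = 0).
    """
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, board_size)
    if not found:
        raise RuntimeError("calibration pattern not detected")
    # 3D coordinates of the board corners in the display plane (z = 0)
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_size_m
    ok, rvec, tvec = cv2.solvePnP(objp, corners, camera_matrix, dist_coeffs)
    if not ok:
        raise RuntimeError("pose estimation failed")
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec
```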
Taking the center point of the smart screen as the coordinate origin (0, 0, 0), the pose information of the first pointing device collected by the first visual sensor is (x_tip, y_tip, z_tip, θ_x, θ_y, θ_z), where x_tip, y_tip and z_tip are the coordinate values on the x-axis, y-axis and z-axis, respectively, of a point in the three-dimensional model of the first pointing device relative to the coordinate origin (0, 0, 0); θ_x is the pitch angle of the three-dimensional model of the first pointing device relative to the wrist in the user's three-dimensional model; θ_y is the roll angle of the three-dimensional model of the first pointing device relative to the wrist in the user's three-dimensional model; and θ_z is the heading angle of the three-dimensional model of the first pointing device relative to the wrist in the user's three-dimensional model.
In a specific embodiment, the wrist part in the first user's three-dimensional model can be extracted by inputting the first user's three-dimensional model into an extraction model, where the extraction model can be a deep learning (DL) network, a convolutional neural network (CNN), or the like. The extraction model can be obtained by training with a large number of three-dimensional models of second users and the wrist parts of those three-dimensional models. The three-dimensional model of the first user is collected by the first visual sensor, the three-dimensional model of a second user can be collected by a second visual sensor, and the wrist part of the second user's three-dimensional model can be obtained by manual annotation. Here, the first visual sensor and the second visual sensor can be the same visual sensor or two different visual sensors; for example, they can be the same stereo camera or two different stereo cameras. Alternatively, the first visual sensor can be a stereo camera and the second visual sensor a lidar, and so on. Similarly, the first IMU and the second IMU can be the same IMU or two different IMUs.
In a specific embodiment, the rotation angle of the three-dimensional model of the first pointing device relative to the wrist in the user's three-dimensional model can be determined in the following manner: a line is drawn connecting a point in the three-dimensional model of the first pointing device with a point in the wrist of the user's three-dimensional model, and the rotation angles of this connecting line relative to a coordinate system whose origin is the point in the wrist are then calculated.
In a specific embodiment, the point in the three-dimensional model of the first pointing device can be any point in the three-dimensional model of the first pointing device; for example, it can be a vertex of the three-dimensional model of the first pointing device (for example, the pen tip in the three-dimensional model of a writing pen), the center of mass of the three-dimensional model of the first pointing device, an end point of the three-dimensional model of the first pointing device, and so on. In the embodiments of this application, the pen tip in the three-dimensional model of the first pointing device is taken as an example for description.
In a specific embodiment, the point in the wrist of the user's three-dimensional model can be any point in the wrist of the user's three-dimensional model; for example, it can be the center point of the wrist in the user's three-dimensional model, and so on. The center point of the wrist in the user's three-dimensional model can be obtained by averaging the coordinate values of the individual points of the wrist in the user's three-dimensional model. The above examples are described using a single point in the wrist of the user's three-dimensional model; in other embodiments, two points, three points or even more points can also be used, which is not specifically limited here.
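By way of illustration only, the averaging of the wrist points and the rotation angles of the connecting line described above could be computed as sketched below. The angle convention (an elevation-style pitch and a heading about the vertical) is an assumption for illustration; roll is not observable from a single connecting line and would have to come from additional model points.

```python
import numpy as np

def wrist_center(wrist_points):
    # Average the coordinate values of all points of the wrist in the user's 3D model.
    return np.mean(np.asarray(wrist_points, dtype=float), axis=0)

def connecting_line_angles(wrist_points, pen_tip):
    # Vector from the wrist center to the pen tip of the pointing device's 3D model,
    # expressed in the display-centered coordinate system.
    origin = wrist_center(wrist_points)
    dx, dy, dz = np.asarray(pen_tip, dtype=float) - origin

    pitch = np.degrees(np.arctan2(dz, np.hypot(dx, dy)))  # elevation of the connecting line
    heading = np.degrees(np.arctan2(dy, dx))              # heading of the connecting line
    return origin, pitch, heading
```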
It can be understood that the first visual sensor can provide relatively accurate absolute pose information relative to the center point of the display area, but the first visual sensor usually cannot provide continuous absolute pose information. Therefore, in the embodiments of the present invention, a first IMU is introduced; the first IMU can continuously provide relatively accurate relative pose information with respect to a starting position. Through the cooperation of the two sensors, continuous pose information can be provided while the systematic errors and random errors caused by relying on the IMU alone are avoided, which can improve the accuracy of determining the fixed-point trajectory in the air.
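A minimal sketch of this cooperation between the two sensors is given below. It assumes that the visual sensor delivers occasional absolute position fixes while the IMU delivers high-rate relative increments; the simple reset-and-integrate strategy is only one possible way to combine them and is not the specific fusion method of this application.

```python
import numpy as np

class FusedPoseEstimator:
    # Combines low-rate absolute positions (visual sensor) with high-rate relative deltas (IMU).

    def __init__(self):
        self.position = np.zeros(3)  # current estimate in the display-centered frame

    def on_visual_fix(self, absolute_position):
        # The visual sensor provides an absolute position: reset the estimate so that the
        # drift accumulated by the IMU since the previous fix is discarded.
        self.position = np.asarray(absolute_position, dtype=float)

    def on_imu_delta(self, delta_position):
        # Between visual fixes, integrate the relative motion reported by the IMU so that
        # the pose estimate remains continuous.
        self.position = self.position + np.asarray(delta_position, dtype=float)
        return self.position
```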
S103: The first pointing device sends the pose information of the first pointing device collected by the first IMU to the air input device. Correspondingly, the air input device receives the pose information of the first pointing device collected by the first IMU and sent by the first pointing device.

In a specific embodiment, with reference to Figure 3, Figure 4 and Figure 7, the first pointing device is provided with one or more of a wireless communication module and a USB interface. Correspondingly, the air input device is also provided with one or more of a wireless communication module and a USB communication module. When the first pointing device and the air input device communicate wirelessly, there is no data cable between the first pointing device and the air input device, which makes the first pointing device more convenient to use. When the first pointing device and the air input device communicate over USB, the data communication between the first pointing device and the air input device can be smoother.
S104: The first visual sensor sends the pose information of the first pointing device collected by the first visual sensor to the air input device. Correspondingly, the air input device receives the pose information of the first pointing device collected by the first visual sensor and sent by the first visual sensor.

In a specific embodiment, the first visual sensor is provided with one or more of a wireless communication module, a USB interface, a wired LAN communication module and an HDMI communication module. Correspondingly, the air input device is also provided with a wireless communication module and a USB communication module. When the first visual sensor and the air input device communicate wirelessly, there is no data cable between the first visual sensor and the air input device, which makes the first visual sensor more convenient to use. When the first visual sensor and the air input device communicate over USB, wired LAN or HDMI, the data communication between the first visual sensor and the air input device can be smoother.
S105: The air input device determines the fixed-point starting point in the display area.

In a specific embodiment, the air input device can be the smart screen in Figure 2, the projector with the fixed-point trace determination function in Figure 6, an air input device composed of a fixed-point trace determination device and a projector, and so on. For the specific structure of the air input device, refer to Figure 2 or Figure 6 and the related descriptions, which are not repeated here.

In a specific embodiment, the display area can be an area capable of displaying a fixed-point trajectory; for example, it can be the displayable area of a smart screen, the projection area of a projector, and so on.
In a specific embodiment, the air input device determines the fixed-point starting point in the display area as follows: the processor of the air input device obtains, from the memory of the air input device, the three-dimensional data of the user and the three-dimensional data of the first pointing device collected by the first visual sensor. The processor of the air input device establishes the three-dimensional model of the user and the three-dimensional model of the first pointing device based on the three-dimensional data of the user and the three-dimensional data of the first pointing device, respectively. Then, the processor of the air input device determines the intersection of the normal vector of a specific part of the user's three-dimensional model, or of the three-dimensional model of the first pointing device, with the display area as the fixed-point starting point. Here, the specific part of the user's three-dimensional model can be any part of the user's three-dimensional model, for example, the eyes, the tip of the nose, the fingertip, and so on. Moreover, priorities can be set for the specific parts of the user's three-dimensional model according to different conditions; for example, when the user points at the display area with a finger, the fingertip is used preferentially to determine the fixed-point starting point, and when the user does not point at the display area with a finger, the eyes are used preferentially to determine the fixed-point starting point. The specific part of the three-dimensional model of the first pointing device can be any part of that model, for example, an end point, the center, and so on. In a more specific embodiment, when the first pointing device is a writing pen, the point in the three-dimensional model of the first pointing device can be the pen tip of the writing pen. The following description is given with reference to the specific embodiment in Figure 10. Taking the center point of the smart screen as the coordinate origin (0, 0, 0), if the coordinates of the eye part in the user's three-dimensional model are (x_eye, y_eye, z_eye) and the normal vector is (x_n, y_n, z_n), then the fixed-point starting point is the intersection point (x, y, 0) of the normal vector (x_n, y_n, z_n) of the eye part of the user's three-dimensional model with the smart screen. In the existing technical solution, the fixed-point starting point is determined based on the pose information of the IMU in the first pointing device.
Therefore, the fixed-point starting point determined based on the pose information of the IMU is often not the fixed-point starting point desired by the user, and the user often needs to adjust the pointing device multiple times to point at the desired starting point, which is very inefficient. In this application, the visual sensor is used to determine where the user's eyes are looking in order to determine the starting point of drawing, which allows the fixed-point starting point desired by the user to be determined more accurately and efficiently; moreover, the fixed-point starting point is wherever the user's eyes look, which is very convenient, natural and user-friendly. In addition, because modeling from three-dimensional data collected by a visual sensor has very high accuracy, determining the fixed-point starting point from the user's three-dimensional model or the three-dimensional model of the first pointing device established from the three-dimensional data collected by the visual sensor is also very accurate.
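As an illustration of the Figure 10 example, the intersection of the eye-part normal vector with the display plane z = 0 of the display-centered coordinate system can be computed as sketched below; the sketch assumes the normal vector actually points toward the display plane.

```python
def fixed_point_start(eye, normal):
    # eye    -- (x_eye, y_eye, z_eye), eye position in the display-centered frame
    # normal -- (x_n, y_n, z_n), normal vector of the eye part
    # Returns the intersection (x, y, 0) with the display plane, or None if the
    # normal vector is parallel to the display plane.
    x_eye, y_eye, z_eye = eye
    x_n, y_n, z_n = normal
    if z_n == 0:
        return None
    t = -z_eye / z_n                      # ray parameter at which z reaches 0
    return (x_eye + t * x_n, y_eye + t * y_n, 0.0)
```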
S106: The air input device determines the fixed-point trajectory of the first pointing device in the air.

In a specific embodiment, the air input device determines the fixed-point trajectory of the first pointing device in the air as follows: during the movement of the first pointing device in the air, the processor of the air input device determines the fixed-point trajectory of the first pointing device in the air based on the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the IMU in the first pointing device, and the mapping relationship between first pose information and a first fixed-point trajectory, where the first pose information in the mapping relationship includes pose information collected by a second visual sensor and pose information collected by a second IMU.
In a more specific embodiment, the processor of the air input device determines the fixed-point trajectory of the first pointing device in the air based on the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and a trajectory prediction model, where the trajectory prediction model is obtained by training with the first pose information collected by the second visual sensor, the pose information collected by the second IMU, and the first fixed-point trajectory. The trajectory prediction model can be a deep neural network (DNN), a linear regression model, and so on.
In a specific embodiment, the trajectory prediction model can be expressed as:

y = f(State_1, State_2)
where y is the fixed-point trajectory of the first pointing device in the air, State_1 is the pose information of the first pointing device collected by the first visual sensor, State_2 is the pose information of the first pointing device collected by the first IMU, and f( ) is the mapping relationship. In a specific embodiment, State_1, the pose information of the first pointing device collected by the first visual sensor, is (x_tip, y_tip, z_tip, θ_x, θ_y, θ_z), and State_2, the pose information of the first pointing device collected by the first IMU, is (x_pen, y_pen, z_pen, yaw, roll, pitch).
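Purely for illustration, the two pose vectors could be represented with the following named tuples; the field names simply follow the notation above.

```python
from collections import namedtuple

State1 = namedtuple("State1", "x_tip y_tip z_tip theta_x theta_y theta_z")  # visual sensor pose
State2 = namedtuple("State2", "x_pen y_pen z_pen yaw roll pitch")           # IMU pose
```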
In a more specific embodiment, as shown in Figure 11, the trajectory prediction model can include an input layer, hidden layers and an output layer.
Input layer:

Assume that the inputs of the input layer are S_1 and S_2, where S_1 is the pose information of the first pointing device collected by the first visual sensor and S_2 is the pose information of the first pointing device collected by the first IMU; the output equals the input, that is, no processing is performed on the input. For simplicity of description, it is assumed here that the input layer does not perform any processing; in practical applications, however, the input layer can perform normalization or other processing, which is not specifically limited here.
Hidden layers:
The outputs S_1 and S_2 of the input layer are taken as the input of the hidden layers. Assume there are L (L ≥ 2) hidden layers in total, and let Z^l denote the output of the l-th layer, where 1 ≤ l ≤ L. When l = 1,

Z_1^1 = S_1, Z_2^1 = S_2.

The relationship between the l-th layer and the (l+1)-th layer is:

a^{l+1} = W_1^l Z_1^l + W_2^l Z_2^l + b^l
Z_1^{l+1} = σ_1^{l+1}(a^{l+1}), Z_2^{l+1} = σ_2^{l+1}(a^{l+1})

where W_1^l is the weight vector of Z_1^l in the l-th layer, W_2^l is the weight vector of Z_2^l in the l-th layer, b^l is the bias vector of the l-th layer, a^{l+1} is the intermediate vector of the (l+1)-th layer, σ_1^{l+1} is the first activation function of the (l+1)-th layer, σ_2^{l+1} is the second activation function of the (l+1)-th layer, and Z^{l+1} = (Z_1^{l+1}, Z_2^{l+1}) is the hidden-layer result of the (l+1)-th layer. The first activation function and the second activation function can each be any one of the sigmoid function, the hyperbolic tangent function, the ReLU function, and so on.
Output layer:
Assume that the output result Z^L of the L-th layer is specifically:

Z^L = W_1^{L-1} Z_1^{L-1} + W_2^{L-1} Z_2^{L-1} + b^{L-1}

where W_1^{L-1} is the weight vector of Z_1^{L-1} in the (L-1)-th layer, W_2^{L-1} is the weight vector of Z_2^{L-1} in the (L-1)-th layer, and b^{L-1} is the bias vector of the (L-1)-th layer.
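A minimal PyTorch-style sketch of this two-stream structure is given below. The layer widths, number of hidden layers, activation choices and output dimension are illustrative assumptions and do not reproduce the exact network of Figure 11.

```python
import torch
import torch.nn as nn

class TrajectoryPredictionModel(nn.Module):
    # Each hidden layer computes a = W1*Z1 + W2*Z2 + b, then applies two activation
    # functions to obtain the next pair (Z1, Z2); the output layer recombines the pair.
    def __init__(self, in_dim=6, hidden_dim=64, out_dim=3, num_hidden=2):
        super().__init__()
        self.w1 = nn.ModuleList()
        self.w2 = nn.ModuleList()
        dims = [in_dim] + [hidden_dim] * (num_hidden - 1)
        for d in dims:
            self.w1.append(nn.Linear(d, hidden_dim))              # W1^l together with bias b^l
            self.w2.append(nn.Linear(d, hidden_dim, bias=False))  # W2^l (bias kept once per layer)
        self.act1 = nn.ReLU()   # first activation function
        self.act2 = nn.Tanh()   # second activation function
        self.out1 = nn.Linear(hidden_dim, out_dim)                # W1^{L-1} together with b^{L-1}
        self.out2 = nn.Linear(hidden_dim, out_dim, bias=False)    # W2^{L-1}

    def forward(self, state1, state2):
        z1, z2 = state1, state2
        for lin1, lin2 in zip(self.w1, self.w2):
            a = lin1(z1) + lin2(z2)               # intermediate vector a^{l+1}
            z1, z2 = self.act1(a), self.act2(a)   # Z1^{l+1}, Z2^{l+1}
        return self.out1(z1) + self.out2(z2)      # output layer result Z^L
```

For example, model = TrajectoryPredictionModel() followed by model(torch.randn(8, 6), torch.randn(8, 6)) would return a batch of eight predicted three-dimensional trajectory points.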
In a specific embodiment, before the trajectory prediction model is used, it needs to be trained. The process of training the trajectory prediction model is specifically as follows: a large amount of first pose information and the corresponding first fixed-point trajectories are obtained, where the first pose information includes the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU. Then, the pose information of the second pointing device collected by the second visual sensor, the pose information of the second pointing device collected by the second IMU and the corresponding first fixed-point trajectory are input repeatedly into the untrained trajectory prediction model for training, until the trajectory prediction model can correctly predict fixed-point trajectories. For a single training iteration, because the output of the trajectory prediction model should be as close as possible to the value that is really to be predicted, the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU can be input into the trajectory prediction model to obtain a predicted value, and the first fixed-point trajectory is taken as the really desired target value; the current predicted value is compared with the really desired target value, and the weight vectors of each layer of the deep neural network in the trajectory prediction model are updated according to the difference between the two (of course, there is usually an initialization process before the first update, that is, parameters are pre-configured for each layer of the trajectory prediction model). For example, if the predicted value of the trajectory prediction model is too high, the weight vectors are adjusted to make it predict a lower value, and the adjustment continues until the trajectory prediction model can predict the really desired target value. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value"; this is the loss function or objective function, an important equation used to measure the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, so training the trajectory prediction model becomes a process of reducing this loss as much as possible.
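The training process described above could, for example, be realized with a loop of the following form, using the model sketched earlier and the mean squared error as the loss function; the optimizer, learning rate and data format are illustrative assumptions.

```python
import torch
import torch.nn as nn

def train_trajectory_model(model, dataset, epochs=50, lr=1e-3):
    # dataset yields (state1, state2, target) tuples, where state1 comes from the second
    # visual sensor, state2 from the second IMU, and target from the first fixed-point trajectory.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()  # measures the difference between predicted and target values
    for _ in range(epochs):
        for state1, state2, target in dataset:
            prediction = model(state1, state2)
            loss = loss_fn(prediction, target)  # higher loss means a larger difference
            optimizer.zero_grad()
            loss.backward()                     # compute gradients for every layer's weights
            optimizer.step()                    # update the weight vectors accordingly
    return model
```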
Here, the first visual sensor and the second visual sensor can be the same visual sensor or two different visual sensors; for example, they can be the same stereo camera or two different stereo cameras. Alternatively, the first visual sensor can be a stereo camera and the second visual sensor a lidar, and so on. Similarly, the first pointing device and the second pointing device may or may not be the same device. When the first pointing device and the second pointing device are the same device, the first IMU and the second IMU are the same IMU; when the first pointing device and the second pointing device are two different devices, the first IMU and the second IMU are two different IMUs. In some possible scenarios, the first visual sensor and the second visual sensor are the same visual sensor, but the first pointing device and the second pointing device are different pointing devices. Alternatively, the first visual sensor and the second visual sensor are the same visual sensor, and the first pointing device and the second pointing device are the same pointing device. Alternatively, the first visual sensor and the second visual sensor are different visual sensors, but the first pointing device and the second pointing device are the same pointing device.
Obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory can occur at the same time or at different times.

When obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory occur at the same time, the training that yields the mapping relationship and the use of the mapping relationship to determine the fixed-point trajectory take place in the same time and place. Therefore, while the mapping relationship is being used to determine the fixed-point trajectory, the mapping relationship can keep changing, so that the accuracy of the mapping relationship is continuously improved while it is being used. In this scenario, the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU that are used when the mapping relationship is applied can serve as the first pose information, and the fixed-point trajectory determined with the mapping relationship can serve as the first fixed-point trajectory, to train the trajectory prediction model in the air input device.
When obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory do not occur at the same time, the mapping relationship can first be obtained through training, and the mapping relationship is then used to determine the fixed-point trajectory. That is, obtaining the mapping relationship through training happened in the past, while using the mapping relationship to determine the fixed-point trajectory happens in the present. In this case, a dedicated training device can take on the task of training to obtain the mapping relationship, to reduce the load on the air input device. Of course, depending on the actual situation, the mapping relationship can still be obtained through training by the air input device. In other words, the mapping relationship is obtained by training with historically collected data. In this scenario, a robotic arm can be used to control the second pointing device to perform fixed-point movement in the air; that is, the robotic arm can simulate a human arm performing three-dimensional fixed-point movement in the air. In order to better simulate a human arm, the robotic arm can be a bionic arm comparable to a real human arm. The trajectory obtained when the robotic arm controls the second pointing device to perform three-dimensional fixed-point movement in the air is the first fixed-point trajectory. When the robotic arm controls the second pointing device to perform fixed-point movement in the air, the second visual sensor collects the three-dimensional data of the robotic arm and the three-dimensional data of the second pointing device, a three-dimensional model of the robotic arm is established from the three-dimensional data of the robotic arm, a three-dimensional model of the second pointing device is established from the three-dimensional data of the second pointing device, and the pose information of the second pointing device is then determined from the three-dimensional model of the robotic arm and the three-dimensional model of the second pointing device (for details, refer to the process in which the first visual sensor determines the pose information of the first pointing device in step S102). In addition, when the robotic arm controls the second pointing device to perform fixed-point movement in the air, the second IMU in the second pointing device collects the pose information of the second pointing device (for details, refer to the process in which the first IMU of the first pointing device collects the pose information of the first pointing device in step S101).
The pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU in the second pointing device are collectively referred to as the first pose information. The robotic arm sends the first fixed-point trajectory to the air input device or the training device, the second visual sensor sends the pose information of the second pointing device collected by the second visual sensor to the air input device or the training device, and the second pointing device sends the pose information of the second pointing device collected by the second IMU in the second pointing device to the air input device or the training device, to train the trajectory prediction model.
In the above example, the description assumes that the positions of the first visual sensor and the second visual sensor do not change; for example, the first visual sensor can be placed at the middle of the upper edge of the smart screen, and the second visual sensor can also be placed at the middle of the upper edge of the smart screen. In practical applications, however, the positions of the first visual sensor and the second visual sensor may differ. For example, the first visual sensor can be placed at the middle of the upper edge of the smart screen, and the second visual sensor can be placed at the upper-left corner of the smart screen. In this case, the transformation relationship between the second visual sensor and the center of the display area needs to be re-calibrated, and through this transformation relationship, the three-dimensional data of the robotic arm and the three-dimensional data of the second pointing device collected by the second visual sensor, which take the visual center of the second visual sensor as the coordinate origin, are converted into three-dimensional data of the robotic arm and three-dimensional data of the second pointing device that take the center of the display area as the coordinate origin.
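A minimal sketch of this frame conversion is given below, assuming that the re-calibration yields a rotation matrix and a translation vector describing the transformation from the second visual sensor's frame to the display-centered frame.

```python
import numpy as np

def sensor_to_display_frame(points_sensor, rotation, translation):
    # points_sensor -- (N, 3) points whose origin is the visual center of the second sensor
    # rotation      -- (3, 3) rotation matrix obtained from the re-calibration
    # translation   -- (3,) translation vector obtained from the re-calibration
    points_sensor = np.asarray(points_sensor, dtype=float)
    return points_sensor @ rotation.T + translation
```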
When obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory occur at the same time, the first visual sensor and the second visual sensor used are usually the same visual sensor, and the first pointing device and the second pointing device are the same pointing device. When obtaining the mapping relationship through training and using the mapping relationship to determine the fixed-point trajectory do not occur at the same time, the first visual sensor and the second visual sensor can be the same visual sensor while the first pointing device and the second pointing device are different pointing devices; or the first visual sensor and the second visual sensor are the same visual sensor and the first pointing device and the second pointing device are the same pointing device; or the first visual sensor and the second visual sensor are different visual sensors while the first pointing device and the second pointing device are the same pointing device; or the first visual sensor and the second visual sensor are different visual sensors and the first pointing device and the second pointing device are also different pointing devices.
S107: The air input device determines the fixed-point trace in the display area based on the fixed-point starting point and the fixed-point trajectory.

In a specific embodiment, the processor of the air input device determines the fixed-point trace in the display area based on the fixed-point starting point and the fixed-point trajectory.
S108: The air input device displays the fixed-point trace in the display area.

In a specific embodiment, when the air input device is a smart screen, the fixed-point trace can be displayed in the display area of the display of the smart screen; when the air input device is a projector with the fixed-point trace determination function, or an air input device composed of a device with the fixed-point trace determination function and a projector, the fixed-point trace can be projected onto the projection area through the image projector of the projector.
The above embodiments are described taking as an example the case in which the air input device performs the steps of receiving the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, determining the fixed-point trace based on the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, and displaying the fixed-point trace in the display area. In a possible embodiment, the air input device can include a fixed-point trace determination device and a projector, where the fixed-point trace determination device is configured to receive the pose information of the first pointing device collected by the first visual sensor and the pose information of the first pointing device collected by the first IMU, and to determine the fixed-point trace based on that pose information, and the projector can be used to display the fixed-point trace in the projection area; this is not specifically limited here.
In the above embodiments, in steps S102 and S104, the first visual sensor collects the three-dimensional data of the user and the three-dimensional data of the first pointing device and then determines the pose information of the first pointing device based on those three-dimensional data. In other embodiments, the first visual sensor can instead collect the three-dimensional data of the user and the three-dimensional data of the first pointing device and send them to the air input device, and the air input device determines the pose information of the first pointing device based on the three-dimensional data of the user and the three-dimensional data of the first pointing device. Similarly, the second visual sensor can also send the collected three-dimensional data of the robotic arm and the three-dimensional data of the second pointing device to the air input device or the training device, and the air input device or the training device determines the pose information of the second pointing device based on the three-dimensional data of the robotic arm and the three-dimensional data of the second pointing device.
In the above embodiments, steps S101 and S102 can be performed in either order: step S101 can be performed first and then step S102, step S102 can be performed first and then step S101, or steps S101 and S102 can be performed simultaneously. Likewise, steps S103 and S104 can be performed in either order: step S103 can be performed first and then step S104, step S104 can be performed first and then step S103, or steps S103 and S104 can be performed simultaneously.
Refer to Figure 12, which shows a schematic structural diagram of an air input device provided by this application. As shown in Figure 12, the air input device includes:

The starting point determination unit 310 is configured to determine a fixed-point starting point in the display area.

The trajectory determination unit 320 is configured to, during the three-dimensional fixed-point movement of the first pointing device in the air, determine the fixed-point trajectory of the first pointing device in the air based on the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and the mapping relationship between the first pose information and the first fixed-point trajectory, where the first pose information includes the pose information of the second pointing device collected by the second visual sensor and the pose information of the second pointing device collected by the second IMU.

The trace determination unit 330 is configured to determine the fixed-point trace in the display area based on the fixed-point starting point and the fixed-point trajectory.
The starting point determination unit 310, the trajectory determination unit 320 and the trace determination unit 330 work together to implement the steps performed by the air input device in the foregoing steps S105 to S107. Specifically, the starting point determination unit 310 is configured to perform the step of determining the fixed-point starting point in S105, the trajectory determination unit 320 is configured to perform the step of determining the fixed-point trajectory of the first pointing device in the air in S106, and the trace determination unit 330 is configured to perform the step of determining the fixed-point trace in the display area in S107.
Optionally, the air input device can further include a training unit (not shown in the figure). The training unit is configured to receive the first fixed-point trajectory sent by the robotic arm, where the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform three-dimensional fixed-point movement in the air; and to receive the first pose information and train the neural network with the first pose information and the first fixed-point trajectory to obtain the mapping relationship.

Optionally, the air input device can further include a receiving unit (not shown in the figure), configured to receive the pose information of the first pointing device collected by the first visual sensor (or the three-dimensional data of the user and the three-dimensional data of the first pointing device collected by the first visual sensor) and the pose information of the first pointing device collected by the first IMU. In addition, the receiving unit can also be configured to receive the pose information of the second pointing device collected by the second visual sensor (or the three-dimensional data of the robotic arm and the three-dimensional data of the second pointing device collected by the second visual sensor), the pose information of the second pointing device collected by the second IMU, the first fixed-point trajectory sent by the robotic arm, and so on.

Optionally, the air input device can further include a display unit (not shown in the figure), configured to display the fixed-point trace in the display area.

Optionally, the air input device can include a fixed-point trace determination device and a projector, where the starting point determination unit 310, the trajectory determination unit 320 and the trace determination unit 330 are provided in the fixed-point trace determination device, and the display unit is provided in the projector; this is not specifically limited here.
In addition, for the position of the fixed-point starting point determined by the starting point determination unit 310 and the method of determining the fixed-point starting point, refer to the related introduction in step S105 in Figure 8. For the definitions and acquisition methods of the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU, the first pose information and the first fixed-point trajectory used by the trajectory determination unit 320, and for the training method of the mapping relationship, refer to the related introductions in step S101, step S102 and step S106 in Figure 8, which are not described again here.
Refer to Figure 13, which shows a schematic structural diagram of an air input device provided by this application. The air input device is configured to perform the steps performed by the air input device in the foregoing air input method.

As shown in Figure 13, the air input device includes a memory 410, a processor 420, a communication interface 430 and a bus 440, where the memory 410, the processor 420 and the communication interface 430 are communicatively connected to one another through the bus 440.
The memory 410 can be a read-only memory (ROM), a static storage device, a dynamic storage device or a random access memory (RAM). The memory 410 can store computer instructions, for example, the computer instructions in the starting point determination unit 310, the computer instructions in the trajectory determination unit 320, the computer instructions in the trace determination unit 330, and so on. When the computer instructions stored in the memory 410 are executed by the processor 420, the processor 420 and the communication interface 430 are configured to perform some or all of the methods described in the foregoing steps S105 to S107. The memory 410 can also store data, for example, intermediate data or result data generated by the processor 420 during execution, such as the three-dimensional data of the user collected by the first visual sensor, the three-dimensional data of the first pointing device, the three-dimensional model of the user, the three-dimensional model of the first pointing device, the mapping relationship, the pose information collected by the first visual sensor, and the pose information collected by the first IMU.
The processor 420 can be a CPU, a microprocessor, an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits.

The processor 420 can also be an integrated circuit chip with signal processing capability. During implementation, some or all of the functions of the air input device can be completed by integrated logic circuits of hardware in the processor 420 or by instructions in the form of software. The processor 420 can also be a general-purpose processor, a digital signal processor (DSP), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, to implement or perform the methods, steps and logical block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor, or the processor can be any conventional processor, and so on. The steps of the methods disclosed in connection with the embodiments of this application can be directly embodied as being performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software module can be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 410; the processor 420 reads the information in the memory 410 and completes steps S105 to S108 of the foregoing air input method in combination with its hardware.
The communication interface 430 uses a transceiver module, such as but not limited to a transceiver, to implement communication between the computing device and other devices (for example, a camera device, a microphone or a server).

The bus 440 can include a path for transmitting information between the components of the air input device (for example, the memory 410, the processor 420 and the communication interface 430).

In some embodiments, the air input device can serve as a terminal device for team collaboration and communication. Therefore, optionally, the air input device can further include a camera device 450 and a microphone 460 for collecting image signals and sound signals in real time. Alternatively, the air input device can be connected to the camera device 450 and the microphone 460 through the communication interface 430 for collecting image signals and sound signals in real time.
It can be understood that the structure of the fixed-point trace determination device is similar to that of the air input device shown in Figure 13, except that the fixed-point trace determination device does not need to perform the step of displaying the fixed-point trace in the display area in the foregoing air input method.
All or some of the foregoing embodiments can be implemented by software, hardware, firmware or any combination thereof. When software is used for implementation, they can be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are generated in whole or in part. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center in a wired (for example, coaxial cable, optical fiber, digital subscriber line) or wireless (for example, infrared, radio, microwave) manner. The computer-readable storage medium can be any usable medium accessible to a computer, or a data storage device such as a server or a data center integrating one or more usable media. The usable medium can be a magnetic medium (for example, a floppy disk, a storage disk or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)), and so on.

Claims (11)

1. An air input method, characterized in that the method comprises:
determining a fixed-point starting point in a display area;
during a three-dimensional fixed-point movement of a first pointing device in the air, determining a fixed-point trajectory of the first pointing device in the air based on pose information of the first pointing device collected by a first visual sensor, pose information of the first pointing device collected by a first inertial measurement unit (IMU) in the first pointing device, and a mapping relationship between first pose information and a first fixed-point trajectory, wherein the first pose information comprises pose information of a second pointing device collected by a second visual sensor and pose information of the second pointing device collected by a second IMU; and
determining a fixed-point trace in the display area based on the fixed-point starting point and the fixed-point trajectory.
2. The method according to claim 1, wherein the pose information of the first pointing device collected by the first visual sensor comprises three-dimensional coordinates of a point in a three-dimensional model of the first pointing device, and a rotation angle of the three-dimensional model of the first pointing device relative to a wrist in a three-dimensional model of a user, wherein the three-dimensional model of the user and the three-dimensional model of the first pointing device are established based on three-dimensional data of the user and three-dimensional data of the first pointing device collected by the first visual sensor.

3. The method according to claim 2, wherein the fixed-point starting point is an intersection of a normal vector of a specific part in the three-dimensional model of the user or in the three-dimensional model of the first pointing device with the display area.

4. The method according to claim 3, wherein the fixed-point starting point is an intersection of a normal vector of an eye part in the three-dimensional model of the user with the display area.
5. The method according to any one of claims 1 to 4, wherein before the determining of the fixed-point trajectory of the first pointing device in the air, the method further comprises:
receiving the first fixed-point trajectory sent by a robotic arm, wherein the first fixed-point trajectory is obtained by the robotic arm controlling the second pointing device to perform fixed-point movement in the air;
receiving the first pose information of the second pointing device, wherein the first pose information is pose information of the second pointing device while the robotic arm controls the second pointing device to perform fixed-point movement in the air; and
training a neural network with the first pose information and the first fixed-point trajectory to obtain the mapping relationship.
  6. An air input device, characterized by comprising a processor and a display unit, wherein the processor is connected to the display unit;
    the processor is configured to: determine a pointing start point in a display area generated by the display unit; during a three-dimensional pointing movement of a first pointing device in the air, determine a pointing trajectory of the first pointing device in the air according to pose information of the first pointing device collected by a first visual sensor, pose information of the first pointing device collected by an IMU in the first pointing device, and a mapping relationship between first pose information and a first pointing trajectory, wherein the first pose information comprises pose information of a second pointing device collected by a second visual sensor and pose information of the second pointing device collected by a second IMU; and determine a pointing trace in the display area according to the pointing start point and the pointing trajectory; and
    the display unit is configured to display the pointing trace in the display area.
  7. The device according to claim 6, wherein the device further comprises a receiver;
    the receiver is configured to receive the pose information of the first pointing device collected by the first visual sensor, wherein the pose information of the first pointing device collected by the first visual sensor comprises three-dimensional coordinates of points in a three-dimensional model of the first pointing device and a rotation angle of the three-dimensional model of the first pointing device relative to a wrist part in a three-dimensional model of a user, and the three-dimensional model of the user and the three-dimensional model of the first pointing device are established according to three-dimensional data of the user and three-dimensional data of the first pointing device collected by the first visual sensor.
  8. The device according to claim 7, wherein
    the receiver is further configured to receive the first pointing trajectory sent by a robotic arm, wherein the first pointing trajectory is obtained by the robotic arm controlling the second pointing device to perform a pointing movement in the air;
    the receiver is further configured to receive the first pose information of the second pointing device, wherein the first pose information is pose information of the second pointing device obtained when the robotic arm controls the second pointing device to perform the pointing movement in the air; and
    the processor is further configured to train a neural network with the first pose information and the first pointing trajectory to obtain the mapping relationship.
  9. An air input system, characterized by comprising:
    a first pointing device, configured to perform a three-dimensional pointing movement in the air and to collect pose information of the first pointing device through a first IMU in the first pointing device;
    a first visual sensor, configured to collect pose information of the first pointing device; and
    an air input device, configured to: determine a pointing start point in a display area generated by the air input device; during the three-dimensional pointing movement of the first pointing device in the air, determine a pointing trajectory of the first pointing device in the air according to the pose information of the first pointing device collected by the first visual sensor, the pose information of the first pointing device collected by the first IMU in the first pointing device, and a mapping relationship between first pose information and a first pointing trajectory, wherein the first pose information comprises pose information of a second pointing device collected by a second visual sensor and pose information of the second pointing device collected by a second IMU; and determine a pointing trace in the display area according to the pointing start point and the pointing trajectory;
    wherein the air input device is further configured to display the pointing trace in the display area.
  10. The system according to claim 9, wherein the pose information of the first pointing device comprises three-dimensional coordinates of points in a three-dimensional model of the first pointing device and a rotation angle of the three-dimensional model of the first pointing device relative to a wrist part in a three-dimensional model of a user, and the three-dimensional model of the user and the three-dimensional model of the first pointing device are established according to three-dimensional data of the user and three-dimensional data of the first pointing device collected by the first visual sensor.
  11. The system according to claim 9 or 10, wherein
    the air input device is further configured to receive the first pointing trajectory sent by a robotic arm, wherein the first pointing trajectory is obtained by the robotic arm controlling the second pointing device to perform a three-dimensional pointing movement in the air;
    the air input device is further configured to receive the first pose information of the second pointing device, wherein the first pose information is pose information of the second pointing device obtained when the robotic arm controls the second pointing device to perform the pointing movement in the air; and
    the air input device is further configured to train a neural network with the first pose information and the first pointing trajectory to obtain the mapping relationship.
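
Claims 3 and 4 locate the pointing start point at the intersection of a normal vector taken from the user's three-dimensional model (in claim 4, the eye part) with the display area. The following minimal Python sketch shows that ray-plane intersection; the function name, the plane parametrization (a point on the display plane plus its unit normal), and the NumPy dependency are illustrative assumptions rather than details disclosed in the claims.

    import numpy as np

    def pointing_start_point(eye_point, eye_normal, plane_point, plane_normal):
        """Intersect the ray cast along the eye normal with the display plane.

        eye_point, eye_normal: 3D position and unit direction taken from the
        user's reconstructed 3D model (claim 4 uses the eye part).
        plane_point, plane_normal: a point on the display area and its unit normal.
        Returns the 3D intersection point, or None if the ray is parallel to
        the plane or points away from it.
        """
        eye_point = np.asarray(eye_point, dtype=float)
        eye_normal = np.asarray(eye_normal, dtype=float)
        plane_point = np.asarray(plane_point, dtype=float)
        plane_normal = np.asarray(plane_normal, dtype=float)

        denom = np.dot(plane_normal, eye_normal)
        if abs(denom) < 1e-9:   # ray parallel to the display plane
            return None
        t = np.dot(plane_normal, plane_point - eye_point) / denom
        if t < 0:               # display plane lies behind the user
            return None
        return eye_point + t * eye_normal

    # Example: user standing 1.5 m in front of a display lying in the z = 0 plane.
    start = pointing_start_point(
        eye_point=[0.1, 1.6, 1.5],
        eye_normal=[0.0, 0.0, -1.0],
        plane_point=[0.0, 0.0, 0.0],
        plane_normal=[0.0, 0.0, 1.0],
    )
    print(start)   # -> [0.1 1.6 0. ]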
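
Claims 5, 8 and 11 obtain the mapping relationship by training a neural network on pose information recorded while a robotic arm drives the second pointing device along known pointing trajectories. The sketch below illustrates one such supervised setup with a small fully connected network in PyTorch; the window length, the network architecture, and the choice of regressing per-step 2D displacements are assumptions made for illustration, not details disclosed in the claims.

    import torch
    import torch.nn as nn

    WINDOW = 10      # assumed number of consecutive fused pose samples per input
    POSE_DIM = 12    # assumed per-sample fusion of visual-sensor pose + IMU pose

    class PoseToTrajectory(nn.Module):
        """Maps a short window of fused pose samples to a 2D trajectory increment."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(WINDOW * POSE_DIM, 128),
                nn.ReLU(),
                nn.Linear(128, 64),
                nn.ReLU(),
                nn.Linear(64, 2),   # displacement (dx, dy) in the display plane
            )

        def forward(self, pose_window):
            return self.net(pose_window.flatten(start_dim=1))

    def train_mapping(pose_windows, arm_displacements, epochs=50):
        """pose_windows: (N, WINDOW, POSE_DIM) poses of the second pointing device;
        arm_displacements: (N, 2) ground-truth increments of the robot-arm trajectory."""
        model = PoseToTrajectory()
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            optimizer.zero_grad()
            loss = loss_fn(model(pose_windows), arm_displacements)
            loss.backward()
            optimizer.step()
        return model

    # Synthetic stand-in for data recorded during the robot-arm calibration run.
    poses = torch.randn(256, WINDOW, POSE_DIM)
    targets = torch.randn(256, 2)
    mapping = train_mapping(poses, targets)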
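
At run time, claims 1, 6 and 9 apply the learned mapping to the first pointing device: pose information from the first visual sensor and the first IMU is fused, passed through the mapping, and the resulting trajectory is accumulated from the pointing start point to form the trace shown in the display area. The sketch below is one hedged reading of that loop; the concatenation-based fusion, the 6-dimensional pose vectors, and the stand-in mapping are assumptions, and in practice a network trained as in the previous sketch would take the place of the stand-in.

    import torch

    def air_input_trace(start_point_2d, visual_poses, imu_poses, mapping, window=10):
        """Accumulate the on-screen pointing trace for the first pointing device.

        start_point_2d: (x, y) pointing start point projected into the display area.
        visual_poses:   (T, 6) poses of the first pointing device from the first visual sensor.
        imu_poses:      (T, 6) poses of the first pointing device from the first IMU.
        mapping:        callable taking a (1, window, 12) pose window to a (1, 2) displacement.
        Returns a list of (x, y) points forming the pointing trace.
        """
        fused = torch.cat([visual_poses, imu_poses], dim=-1)   # (T, 12)
        x, y = start_point_2d
        trace = [(x, y)]
        with torch.no_grad():
            for t in range(window, fused.shape[0]):
                pose_window = fused[t - window:t].unsqueeze(0)  # (1, window, 12)
                dx, dy = mapping(pose_window).squeeze(0).tolist()
                x, y = x + dx, y + dy
                trace.append((x, y))
        return trace

    # Example with synthetic sensor streams and a stand-in for the trained mapping.
    dummy_mapping = lambda w: torch.zeros(w.shape[0], 2)
    trace = air_input_trace(
        start_point_2d=(0.5, 0.5),
        visual_poses=torch.randn(100, 6),
        imu_poses=torch.randn(100, 6),
        mapping=dummy_mapping,
    )
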
PCT/CN2023/077011 2022-03-25 2023-02-18 Air input method, device and system WO2023179264A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210302541.4A CN116841385A (en) 2022-03-25 2022-03-25 Space-apart input method, device and system
CN202210302541.4 2022-03-25

Publications (1)

Publication Number Publication Date
WO2023179264A1 true WO2023179264A1 (en) 2023-09-28

Family

ID=88099884

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/077011 WO2023179264A1 (en) 2022-03-25 2023-02-18 Air input method, device and system

Country Status (2)

Country Link
CN (1) CN116841385A (en)
WO (1) WO2023179264A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3678388B2 (en) * 1996-10-11 2005-08-03 株式会社リコー Pen-type input device and pattern recognition method for pen-type input device
US20120013578A1 (en) * 2010-07-15 2012-01-19 Avermedia Information, Inc. Pen-shaped pointing device and shift control method thereof
CN104007846A (en) * 2014-05-22 2014-08-27 深圳市宇恒互动科技开发有限公司 Three-dimensional figure generating method and electronic whiteboard system
JP2017027472A (en) * 2015-07-24 2017-02-02 株式会社リコー Coordinate input system, coordinate input device, coordinate input method, and program
CN113052078A (en) * 2021-03-25 2021-06-29 Oppo广东移动通信有限公司 Aerial writing track recognition method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN116841385A (en) 2023-10-03

Similar Documents

Publication Publication Date Title
US11625841B2 (en) Localization and tracking method and platform, head-mounted display system, and computer-readable storage medium
US11321929B2 (en) System and method for spatially registering multiple augmented reality devices
KR20180075191A (en) Method and electronic device for controlling unmanned aerial vehicle
US20230037922A1 (en) Image display method and apparatus, computer device, and storage medium
CN111026314B (en) Method for controlling display device and portable device
WO2021004412A1 (en) Handheld input device, and method and apparatus for controlling display position of indication icon thereof
TWI453462B (en) Telescopic observation for virtual reality system and method thereof using intelligent electronic device
TW201820077A (en) Mobile devices and methods for determining orientation information thereof
US20240144617A1 (en) Methods and systems for anchoring objects in augmented or virtual reality
US20210263592A1 (en) Hand and totem input fusion for wearable systems
CN107193380A (en) A kind of low-cost and high-precision virtual reality positioning and interactive system
CN113784767A (en) Thermopile array fusion tracking
CN114332423A (en) Virtual reality handle tracking method, terminal and computer-readable storage medium
EP3627289A1 (en) Tracking system and tracking method using the same
WO2023179264A1 (en) Air input method, device and system
WO2019119999A1 (en) Method and apparatus for presenting expansion process of solid figure, and device and storage medium
WO2021190421A1 (en) Virtual reality-based controller light ball tracking method on and virtual reality device
US11200741B1 (en) Generating high fidelity spatial maps and pose evolutions
CN114373016A (en) Method for positioning implementation point in augmented reality technical scene
CN115686233A (en) Interaction method, device and interaction system for active pen and display equipment
KR20180106178A (en) Unmanned aerial vehicle, electronic device and control method thereof
JP7346977B2 (en) Control devices, electronic equipment, control systems, control methods, and programs
US20230089061A1 (en) Space recognition system, space recognition method, information terminal, and server apparatus
CN103034345B (en) Geographical virtual emulation 3D mouse pen in a kind of real space
Vincze et al. What-You-See-Is-What-You-Get Indoor Localization for Physical Human-Robot Interaction Experiments

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23773515

Country of ref document: EP

Kind code of ref document: A1