WO2020258106A1 - Method and device for gesture recognition, and method and device for positioning and tracking - Google Patents

Method and device for gesture recognition, and method and device for positioning and tracking

Info

Publication number
WO2020258106A1
Authority
WO
WIPO (PCT)
Prior art keywords: AoA, point, gesture, distance, hand
Prior art date
Application number
PCT/CN2019/093126
Other languages: English (en), French (fr)
Inventor
刘建华
周安福
马华东
杨宁
张治�
唐海
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Priority to PCT/CN2019/093126 priority Critical patent/WO2020258106A1/zh
Priority to CN201980002838.5A priority patent/CN110741385B/zh
Publication of WO2020258106A1 publication Critical patent/WO2020258106A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition

Definitions

  • the embodiments of the present application relate to the field of human-computer interaction, and more specifically, to a method and device for gesture recognition, and a method and device for location tracking.
  • millimeter wave can greatly increase the speed of wireless networks.
  • millimeter waves can be applied to distance sensing and measurement.
  • millimeter-wave distance sensing and measurement performs poorly on small objects. How to improve millimeter-wave distance sensing and measurement of small objects is an urgent problem to be solved.
  • the embodiments of the present application provide a method and device for gesture recognition, which can perform gesture recognition based on a gesture deconstruction method and a neural network customized for it, and can be widely used to recognize a large number of different gestures. And the embodiment of the present application provides a method and device for positioning and tracking, which can avoid or correct environmental interference, obtain accurate and interference-free positioning coordinates, and successfully complete real-time positioning and tracking.
  • a method for gesture recognition, which includes:
  • deconstructing the hand according to the gesture information to obtain multiple discrete surface energy points;
  • recognizing the hand gesture according to the movement trend of the multiple surface energy points.
  • a location tracking method which includes:
  • a gesture recognition device including:
  • An acquiring unit configured to acquire gesture information collected by at least two radar sensors after the millimeter wave signal illuminates the user's hand and is reflected by the hand;
  • the processing unit is configured to deconstruct the hand according to the gesture information to obtain multiple discrete surface energy points;
  • the processing unit is further configured to recognize the hand gesture according to the movement trend of the multiple surface energy points.
  • a location tracking device including:
  • An acquiring unit configured to acquire a frame of mixed signals collected by at least two radar sensors after the millimeter wave signal illuminates the target object and is reflected by the target object;
  • a processing unit configured to determine frequency spectrum information according to the mixing signals collected by the at least two radar sensors
  • the processing unit is also used to detect the spectrum information to obtain multiple peak points;
  • the processing unit is further configured to perform denoising processing on the multiple peak points to determine the first peak point among the multiple peak points;
  • the processing unit is further configured to calculate the position coordinates of the target object on a rectangular coordinate system on a two-dimensional plane according to the distance from the target object to the at least two radar sensors at the first peak point and AoA.
  • a gesture recognition device including:
  • the device is configured to execute the method in the first aspect described above or any possible implementation manner thereof.
  • a location tracking device including:
  • the apparatus is configured to execute the method in the second aspect described above or any possible implementation manner thereof.
  • a gesture recognition system including:
  • Transmitter equipment for transmitting millimeter wave signals
  • At least two radar sensors configured to collect gesture information of millimeter wave signals that illuminate the user's hand and are reflected by the hand;
  • a device including a memory for storing programs and data and a processor for calling and running the programs and data stored in the memory, and the device is configured to execute the method in the first aspect or any possible implementation manner thereof.
  • a positioning tracking system including:
  • Transmitter equipment for transmitting millimeter wave signals
  • At least two radar sensors configured to collect a frame of mixed signals after the millimeter wave signal illuminates the target object and is reflected by the target object;
  • a device including a memory for storing programs and data and a processor for calling and running the programs and data stored in the memory, and the device is configured to execute the method in the second aspect or any possible implementation manner thereof.
  • a computer-readable storage medium for storing a computer program that enables a computer to execute any one of the above-mentioned first aspect to the second aspect or the method in each implementation manner thereof.
  • a computer program product including computer program instructions that cause a computer to execute any one of the above-mentioned first aspect to the second aspect or the method in each implementation manner thereof.
  • a computer program which when run on a computer, causes the computer to execute any one of the above-mentioned first to second aspects or the method in each of its implementation modes.
  • gesture recognition can be performed based on a gesture deconstruction method and a neural network customized for it, and it can be widely used to recognize a large number of different gestures.
  • FIG. 1 is a schematic flowchart of a method for gesture recognition provided by an embodiment of the present application.
  • Fig. 2 is a schematic diagram of millimeter wave signal transmission provided by an embodiment of the present application.
  • Fig. 3 is a schematic diagram of a transmitted wave and a received echo provided by an embodiment of the present application.
  • Fig. 4 is a schematic diagram of single-click and double-click gestures according to an embodiment of the present application.
  • Fig. 5 is a schematic diagram of a neural network model provided by an embodiment of the present application.
  • Fig. 6 is a schematic diagram of gesture recognition provided by an embodiment of the present application.
  • Fig. 7 is a schematic flowchart of a method for location tracking according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of another millimeter wave signal transmission provided by an embodiment of the present application.
  • Fig. 9 is a schematic structural diagram of a gesture recognition device according to an embodiment of the present application.
  • Fig. 10 is a schematic structural diagram of a location tracking device according to an embodiment of the present application.
  • Fig. 11 is a schematic structural diagram of a gesture recognition device according to an embodiment of the present application.
  • Fig. 12 is a schematic structural diagram of a location tracking device according to an embodiment of the present application.
  • Fig. 13 is a schematic structural diagram of a gesture recognition system according to an embodiment of the present application.
  • Fig. 14 is a schematic structural diagram of a positioning tracking system according to an embodiment of the present application.
  • millimeter wave can greatly increase the speed of wireless networks.
  • IEEE (Institute of Electrical and Electronics Engineers) 802.11ad, operating in the 60 GHz frequency band, supports a data transmission rate of up to 6.7 Gbps, and its evolution standard IEEE 802.11ay will provide a data transmission rate of 20 Gbps. Therefore, millimeter wave radio is expected to bring wireless network access into the multi-gigabit-per-second (multi-Gbps) era.
  • millimeter-wave radio modules will be widely installed on mobile phones, wearables, smart hardware, or more widely IoT devices, and become a mainstream communication technology.
  • millimeter wave perception also has its unique advantages, and can provide smarter, convenient, and interesting product experiences.
  • Millimeter-wave sensing does not require a screen to recognize actions; its recognition range is wider, it is hardly affected by light and heat radiation sources, and it can measure true distance. It performs well in distance perception, gesture detection, proximity detection, people detection, distance measurement, presence detection, and so on.
  • detection of target objects equipped with radio frequency identification (RFID) tags can already achieve centimeter-level positioning, using interferometric techniques: the relative phase measured at multiple RFID receivers generates a phase hologram, the potential position of the object is mapped to the phase, and position tracking is then realized through phase changes. Because this solution must attach an RFID tag to the detected object, its application is cumbersome and cannot meet the requirements of daily use.
  • this application proposes a positioning and tracking solution.
  • the angle of arrival (AoA) and the distance (range) from the target object to the chip can be accurately obtained; after background noise processing and sudden abnormal noise point processing, the position of the target object on the two-dimensional plane can be calculated, and positioning and tracking can then be realized.
  • the Doppler effect means that the wavelength of a wave radiated by an object changes because of relative motion between the wave source and the observer.
  • in front of a moving wave source the wave is compressed: the wavelength becomes shorter and the frequency higher; behind the moving wave source the opposite occurs: the wavelength becomes longer and the frequency lower.
  • the higher the velocity of the wave source, the greater the effect.
  • from the Doppler shift, the velocity of the wave source along the observation direction can be calculated. That is, the above-mentioned Range-Doppler continuous heat map can also be called a distance-velocity continuous heat map.
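The relation above can be sketched numerically. For a monostatic radar the echo's Doppler shift is f_d = 2v/λ (the two-way path doubles the shift), so the radial velocity follows directly; the function name and the 60 GHz example values below are illustrative, not taken from the patent.

```python
# Hedged sketch: radial velocity from the Doppler shift of a radar echo.
# For a monostatic radar, f_d = 2 * v / wavelength, so:
def radial_velocity(doppler_shift_hz, wavelength_m):
    """v = f_d * lambda / 2 (two-way path for a reflected echo)."""
    return doppler_shift_hz * wavelength_m / 2.0

wavelength = 3e8 / 60e9                  # 60 GHz carrier -> ~5 mm wavelength
v = radial_velocity(400.0, wavelength)   # a 400 Hz shift -> 1 m/s
```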
  • this application proposes a gesture recognition solution that avoids directly inputting the original Range-Doppler continuous heat map into the neural network: after acquiring the Range-Doppler continuous heat map, the hand is deconstructed to obtain multiple discrete surface energy points, and hand gestures are recognized according to the movement trend of these surface energy points, so that gesture recognition for unfamiliar users can be realized.
  • the gesture recognition solution proposed in this application can provide a more concise, versatile, and user-acceptable experience. Its biggest advantage is that it achieves high recognition accuracy without requiring the user to input gesture information in advance, and it therefore has very high universality and application value. Combined with the positioning and tracking technology, it can realize a mouse simulation function and further become a technology that can be commercialized and popularized.
  • the user can directly control the device by just watching the action demonstration once, instead of making the same gesture multiple times as training data and inputting it into the neural network for learning.
  • the user can independently set the ratio by which the positioning and tracking target point is mapped to the screen (for example, 5 cm of actual movement produces 1 cm of movement on the screen), so the solution can be flexibly adapted to different devices such as mobile phones, tablets, and laptops.
  • with the single-click and double-click detection function, it can be an alternative to the physical mouse in some scenarios, that is, a simulated mouse.
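The user-set mapping ratio described above amounts to a simple scaling; a minimal sketch, with the 5:1 example ratio from the text (the function name is hypothetical):

```python
# Sketch of the user-set screen mapping ratio: with a 5:1 ratio,
# 5 cm of actual hand movement maps to 1 cm of cursor movement.
def map_to_screen(actual_cm, ratio=5.0):
    return actual_cm / ratio
```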
  • FIG. 1 is a schematic flowchart of a method 100 for gesture recognition according to an embodiment of the present application. It should be understood that FIG. 1 shows the steps or operations of the method 100, but these steps or operations are only examples, and the embodiment of the present application may also perform other operations or variations of each operation in FIG. 1.
  • the method 100 may be performed by a gesture recognition device, where the gesture recognition device may be a mobile phone, a tablet computer, a portable computer, a personal digital assistant (PDA), etc., or may be a module or system in any of these devices.
  • the method 100 for gesture recognition includes:
  • S110 Obtain gesture information collected by at least two radar sensors after the millimeter wave signal illuminates the user's hand and is reflected by the hand;
  • S120 Deconstruct the hand according to the gesture information to obtain multiple discrete surface energy points;
  • S130 Recognize the hand gesture according to the movement trend of the multiple surface energy points.
  • even an unfamiliar user can use non-contact gesture recognition based on the gesture recognition method 100. That is, when a new user makes gestures in front of the gesture recognition device to control it, the gestures can be accurately recognized without pre-recording the user's own gestures.
  • the method 100 for gesture recognition can be applied to projects that do not require gesture control permissions, such as taking pictures, controlling music players, editing photos, relaying radio stations, and so on.
  • a single double tap gesture can be recognized, or a large number of other gestures can be recognized.
  • the millimeter wave signal may be transmitted by the transmitting device (TX antenna).
  • after the millimeter wave signal illuminates the user's hand and is reflected by the hand, it is collected by the at least two radar sensors (RX antennas); the signals collected by the at least two radar sensors are the gesture information.
  • the movement trend of the multiple surface energy points is reflected by at least one of the following frame sequences of M frames, where M is a positive integer:
  • Centripetal detection point number frame sequence, centripetal average distance frame sequence, centripetal average velocity frame sequence, centrifugal detection point number frame sequence, centrifugal average distance frame sequence, centrifugal average velocity frame sequence, energy centroid detection point number frame sequence, energy centroid average distance frame sequence, energy centroid average velocity frame sequence, angle value θ frame sequence.
  • For example, M is 20.
  • the energy values of the multiple surface energy points obtained in step S120 above are greater than the first threshold value. That is, after deconstructing the hand, it is necessary to screen the surface energy points to screen out the multiple surface energy points that are greater than the first threshold value.
  • the hand is not a rigid body, but a soft body whose surface skin can bend and deform. Therefore, the hand will have both forward and backward movement in different gestures. Based on this, our hands are modeled with their movement trends to describe and record the concrete information of different gestures.
  • the movement trend of the multiple surface energy points can be obtained.
  • the movement trend of the multiple surface energy points obtained by pseudo representative model (PRM) processing has the characteristics of low dimensionality and simplicity, which facilitates the design of the neural network generalization model in the next step.
  • the movement trend of the multiple surface energy points can be obtained in the following manner:
  • the multiple surface energy points are classified according to centripetal (CP) and centrifugal (CF) motion directions relative to the transmitting end of the millimeter wave signal, to obtain a first surface energy point set (CP) and a second surface energy point set (CF);
  • according to the first surface energy point set, the centripetal detection point number frame sequence, the centripetal average distance frame sequence, and the centripetal average velocity frame sequence are determined; according to the second surface energy point set, the centrifugal detection point number frame sequence, the centrifugal average distance frame sequence, and the centrifugal average velocity frame sequence are determined;
  • according to the multiple surface energy points, the energy centroid detection point number frame sequence, the energy centroid average distance frame sequence, and the energy centroid average velocity frame sequence are determined;
  • a frame sequence of angle values ⁇ is determined.
  • the first surface energy point set includes: SEP 1, SEP 2, SEP 3, SEP 4, SEP 5, where the distance of SEP 1 is Range 1 and the speed of SEP 1 is Velocity 1; the distance of SEP 2 is Range 2 and the speed of SEP 2 is Velocity 2; the distance of SEP 3 is Range 3 and the speed of SEP 3 is Velocity 3; the distance of SEP 4 is Range 4 and the speed of SEP 4 is Velocity 4; the distance of SEP 5 is Range 5 and the speed of SEP 5 is Velocity 5.
  • for this frame, the centripetal detection point number is 5, the centripetal average distance is (Range 1+Range 2+Range 3+Range 4+Range 5)/5, and the centripetal average velocity is (Velocity 1+Velocity 2+Velocity 3+Velocity 4+Velocity 5)/5.
  • the second surface energy point set includes: SEP 6, SEP 7, and SEP 8, where the distance of SEP 6 is Range 6 and the speed of SEP 6 is Velocity 6; the distance of SEP 7 is Range 7 and the speed of SEP 7 is Velocity 7; the distance of SEP 8 is Range 8 and the speed of SEP 8 is Velocity 8.
  • for this frame, the centrifugal detection point number is 3, the centrifugal average distance is (Range 6+Range 7+Range 8)/3, and the centrifugal average velocity is (Velocity 6+Velocity 7+Velocity 8)/3.
  • the multiple surface energy points include: SEP 1, SEP 2, SEP 3, SEP 4, SEP 5, SEP 6, SEP 7, and SEP 8, where for each SEP i (i = 1, ..., 8) the distance is Range i, the speed is Velocity i, and the angle is AoA i.
  • for this frame, the energy centroid detection point number is 8, the energy centroid average distance is (Range 1+Range 2+Range 3+Range 4+Range 5+Range 6+Range 7+Range 8)/8, the energy centroid average velocity is (Velocity 1+Velocity 2+Velocity 3+Velocity 4+Velocity 5+Velocity 6+Velocity 7+Velocity 8)/8, and the angle value θ is (AoA 1+AoA 2+AoA 3+AoA 4+AoA 5+AoA 6+AoA 7+AoA 8)/8.
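The per-frame PRM statistics above can be sketched as follows. The function name and the sign convention are assumptions: each surface energy point (SEP) carries (range, velocity, AoA), and a negative velocity is taken here to mean centripetal motion toward the antenna.

```python
# Illustrative sketch of the per-frame PRM statistics described above.
def prm_frame_stats(seps):
    cp = [p for p in seps if p[1] < 0]    # centripetal set (CP), by assumption
    cf = [p for p in seps if p[1] >= 0]   # centrifugal set (CF)

    def avg(points, idx):
        return sum(p[idx] for p in points) / len(points) if points else 0.0

    return {
        "cp_count": len(cp), "cp_range": avg(cp, 0), "cp_velocity": avg(cp, 1),
        "cf_count": len(cf), "cf_range": avg(cf, 0), "cf_velocity": avg(cf, 1),
        "centroid_count": len(seps),
        "centroid_range": avg(seps, 0),
        "centroid_velocity": avg(seps, 1),
        "aoa": avg(seps, 2),
    }

# eight SEPs, as in the example: five centripetal and three centrifugal
seps = [(1.0, -1.0, 10.0), (2.0, -2.0, 20.0), (3.0, -3.0, 30.0),
        (4.0, -4.0, 40.0), (5.0, -5.0, 50.0),
        (6.0, 1.0, 60.0), (7.0, 2.0, 70.0), (8.0, 3.0, 80.0)]
stats = prm_frame_stats(seps)
```

Repeating this per frame over 20 frames yields the 10 frame sequences fed to the neural network.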
  • for each of the multiple surface energy points, the distance from the hand to the transmitting end, the AoA from the hand to the transmitting end (calculated from the phase difference between the at least two radar sensors), and the speed of the hand relative to the transmitting end can be calculated.
  • the transmitted wave is a high-frequency continuous wave, and its frequency changes with time according to the law of triangular waves.
  • the frequency of the echo received by the radar sensor follows the same triangular-wave law as the transmitted frequency, but with a time difference. Using this small time difference, the distance from the target object to the transmitting end can be calculated.
  • the AoA estimation from the hand to the transmitting end uses at least two RX antennas, as shown in Figure 2.
  • the distance difference between the hand and the two RX antennas will cause the phase change of the FFT peak, and the AoA is estimated through the phase change.
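The two estimates above can be sketched as follows; the symbols (sweep slope S, antenna spacing d) and function names are illustrative assumptions, not taken from the patent.

```python
import math

C = 3e8  # speed of light, m/s

def fmcw_range(beat_freq_hz, sweep_slope_hz_per_s):
    """R = c * f_beat / (2 * S): the echo lags the transmitted triangular
    sweep by t = 2R/c, which appears as a beat frequency f_beat = S * t."""
    return C * beat_freq_hz / (2.0 * sweep_slope_hz_per_s)

def aoa_from_phase(delta_phase_rad, antenna_spacing_m, wavelength_m):
    """theta = arcsin(lambda * dphi / (2*pi*d)): the extra path d*sin(theta)
    to the second RX antenna shifts the phase of the FFT peak by dphi."""
    return math.asin(wavelength_m * delta_phase_rad /
                     (2.0 * math.pi * antenna_spacing_m))

r = fmcw_range(100e3, 30e12)                        # 100 kHz beat, 30 MHz/us slope
theta = aoa_from_phase(math.pi / 2, 2.5e-3, 5e-3)   # half-wavelength spacing
```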
  • the gesture information deconstructed by the PRM has the characteristics of low dimensionality and simplicity, which facilitates the design of the neural network generalization model in the next step.
  • the above step S130 may specifically be: inputting the M-frame frame sequences reflecting the movement trend of the multiple surface energy points and an M-frame constant calibration sequence into the neural network model to recognize the hand gesture.
  • the input of the neural network model is 10 time frame sequences and 1 constant calibration sequence.
  • the 10 time frame sequences are: centripetal detection point number frame sequence, centripetal average distance frame sequence, centripetal average velocity frame sequence, centrifugal detection point number frame sequence, centrifugal average distance frame sequence, centrifugal average velocity frame sequence, energy centroid detection point number frame sequence, energy centroid average distance frame sequence, energy centroid average velocity frame sequence, and angle value θ frame sequence.
  • Each sequence is 20 frames long, so the information of each gesture is a 20×11 matrix.
  • the neural network model is an equal amount neural network model.
  • the neural network model is an equal amount neural network model adapted to the above-mentioned PRM.
  • the neural network model 1000 includes at least two equal-learning (Equal-Learning, EL) modules 1010, and each of the at least two equal-learning modules 1010 sequentially includes, from input to output, a first convolutional layer 1111, a first batch normalization layer 1112, a rectified linear unit (ReLU) activation function layer 1113, a second convolutional layer 1114, and a second batch normalization layer 1115.
  • the external and internal input and output size settings of each of the at least two equal learning modules are equal.
  • a convolutional layer 1020 with a 7×7 kernel, which learns 64-dimensional gesture information with a specification of 14×7, is connected in front of the at least two equal learning modules 1010, and/or at least two fully connected layers 1030 are connected after the at least two equal learning modules 1010.
  • the convolutional layer with a 7×7 kernel learns 64-dimensional gesture information with a specification of 14×7 before the at least two equal learning modules, which ensures that the at least two equal learning modules have enough adjustable parameters during training to enhance their learning ability.
  • the at least two fully connected layers are connected after the at least two equal learning modules to facilitate the final feature purification and classification of the neural network model.
  • a maximum pooling layer 1040 is connected in front of the at least two equal learning modules 1010, so that some important values can be moved toward the center of the feature map to increase the learning ability of the neural network model.
  • the input of the neural network model 1000 is a 20×11 matrix 1050, and the matrix 1050 is the information of one gesture.
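The defining "equal" property of an EL module can be sketched in plain Python: a conv → batch norm → ReLU → conv → batch norm stack whose zero ('same') padding keeps input and output sizes identical, so modules can be stacked freely. The padding scheme and single-channel simplification are assumptions, not the patent's exact layer configuration.

```python
import math

def conv2d_same(x, kernel):
    """Size-preserving 2-D cross-correlation with zero padding."""
    H, W = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2

    def px(i, j):  # zero padding outside the feature map
        return x[i][j] if 0 <= i < H and 0 <= j < W else 0.0

    return [[sum(kernel[a][b] * px(i + a - ph, j + b - pw)
                 for a in range(kh) for b in range(kw))
             for j in range(W)] for i in range(H)]

def batch_norm(x, eps=1e-5):
    vals = [v for row in x for v in row]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return [[(v - mean) / math.sqrt(var + eps) for v in row] for row in x]

def relu(x):
    return [[max(v, 0.0) for v in row] for row in x]

def equal_learning_block(x, k1, k2):
    h = relu(batch_norm(conv2d_same(x, k1)))   # first conv + BN + ReLU
    return batch_norm(conv2d_same(h, k2))      # second conv + BN

x = [[float((i * 7 + j) % 5) for j in range(7)] for i in range(14)]  # 14x7 map
k = [[1.0 / 9.0] * 3 for _ in range(3)]
y = equal_learning_block(x, k, k)
assert len(y) == len(x) and len(y[0]) == len(x[0])  # sizes stay equal
```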
  • step S120 may specifically be:
  • the high-pass filtering may subtract the average value of the gesture information data from the gesture information data, thereby eliminating low-frequency components.
  • step a, step b, step c, and step d are sequentially performed to obtain the gesture recognition result of the hand.
  • Step a: deconstruct the hand according to the gesture information to obtain multiple discrete surface energy points;
  • Step b: input the multiple surface energy points into the PRM for hand movement to obtain the movement trend of the multiple surface energy points;
  • Step c: input the M-frame frame sequences reflecting the movement trend of the multiple surface energy points and the M-frame constant calibration sequence into the neural network model;
  • Step d: the neural network model outputs the gesture recognition result of the hand.
  • a non-target gesture library is established, which includes large body or torso movements, small fingertip movements, and other hand movements with trajectories;
  • the target gesture includes a single-click gesture and/or a double-click gesture.
  • the first rule is:
  • Step 1: the probability of recognizing the hand gesture as the target gesture is greater than the first threshold;
  • Step 2: the probability of recognizing the hand gesture as a non-target gesture is less than the second threshold;
  • Step 3: only a gesture classification result that satisfies both Step 1 and Step 2 is taken as a valid recognition result.
  • the first threshold is 90%.
  • the second threshold is 15%.
  • if the neural network model outputs a probability of 95% that the hand gesture is a target gesture and a probability of 5% that it is a non-target gesture, then based on the non-target gesture library and the first rule it can be determined that the gesture recognition result output by the neural network model is a valid recognition result, that is, the gesture is a target gesture.
  • if the neural network model outputs a probability of 75% that the hand gesture is a target gesture and a probability of 25% that it is a non-target gesture, then based on the non-target gesture library and the first rule it can be determined that the gesture recognition result output by the neural network model is an invalid recognition result, and the gesture recognition result is discarded.
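The first rule amounts to a two-threshold check; a minimal sketch with the example thresholds from the text (90% and 15%), with a hypothetical function name:

```python
# Sketch of the first rule: a classification result is valid only when
# both the target-gesture and non-target-gesture conditions hold.
def is_valid_recognition(p_target, p_non_target,
                         first_threshold=0.90, second_threshold=0.15):
    return p_target > first_threshold and p_non_target < second_threshold

assert is_valid_recognition(0.95, 0.05)        # 95% / 5%  -> valid
assert not is_valid_recognition(0.75, 0.25)    # 75% / 25% -> discarded
```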
  • the aforementioned non-target gesture library may also be established in advance or configured in advance.
  • when judging the validity of the gesture result, it is only necessary to determine whether the hand gesture is a target gesture according to the non-target gesture library and the first rule.
  • the action of judging the validity of the gesture result can be performed by a screening module.
  • the gesture recognition based on the gesture deconstruction method and the neural network customized for it can be widely used to recognize a large number of different gestures.
  • gesture recognition based on single-click gestures and/or double-click gestures can realize mouse simulation, improve the practicability of the system, and bring the possibility of new operating modes for smart phones and tablets.
  • FIG. 7 is a schematic flowchart of a location tracking method 200 according to an embodiment of the present application. It should be understood that FIG. 7 shows the steps or operations of the method 200, but these steps or operations are only examples, and the embodiment of the present application may also perform other operations or variations of each operation in FIG. 7.
  • the method 200 can be executed by a location tracking device, where the location tracking device can be a mobile phone, a tablet computer, a portable computer, a PDA, etc., or can be a module or system in any of these devices.
  • the location tracking method 200 includes:
  • S210 Obtain a frame of mixed signals collected by at least two radar sensors after the millimeter wave signal illuminates the target object and is reflected by the target object;
  • S220 Determine frequency spectrum information according to the mixed signals collected by the at least two radar sensors;
  • S230 Detect the frequency spectrum information to obtain multiple peak points;
  • S240 Perform denoising processing on the multiple peak points to determine a first peak point among the multiple peak points;
  • S250 Calculate a position coordinate of the target object on a rectangular coordinate system on a two-dimensional plane according to the distance from the target object to the at least two radar sensors at the first peak point and AoA.
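The coordinate computation of step S250 can be sketched as a polar-to-Cartesian conversion; the axis convention (AoA measured from the y axis of the sensor plane) and the function name are assumptions for illustration.

```python
import math

# Hedged sketch of step S250: given the range r and angle of arrival theta,
# compute rectangular coordinates on the two-dimensional plane.
def position_2d(range_m, aoa_rad):
    x = range_m * math.sin(aoa_rad)   # lateral offset
    y = range_m * math.cos(aoa_rad)   # forward distance
    return x, y

x, y = position_2d(0.5, math.radians(30))  # 0.5 m away at 30 degrees
```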
  • the target object may be a small object, such as a hand, or a certain position or area of the hand. Since the target object is small, there will be relatively strong environmental interference. The embodiment of the present application avoids or corrects these environmental interferences, so that accurate and interference-free positioning coordinates can be obtained, and real-time positioning and tracking of the target object can be successfully completed.
  • the millimeter wave signal may be transmitted by the transmitting end device (TX antenna). After the millimeter wave signal illuminates the target object and is reflected by the target object, it is collected by the at least two radar sensors (RX antennas); the at least two radar sensors collect one frame of mixed signals each time.
  • before calculating the position coordinates of the target object on the rectangular coordinate system on the two-dimensional plane according to the distance and the AoA at the first peak point, that is, before the above step S250, the method 200 further includes:
  • determining whether the distance and/or the AoA at the first peak point can correctly reflect the position coordinates of the target object;
  • if so, the distance and the AoA at the first peak point are used to calculate the position of the target object on the two-dimensional plane.
  • if the absolute value of the difference between the distance at the first peak point and the distance at the first point is greater than the first threshold, or the absolute value of the difference between the AoA at the first peak point and the AoA at the first point is greater than the second threshold, it is determined that the distance and/or the AoA at the first peak point cannot correctly reflect the position coordinates of the target object; and/or,
  • if the absolute value of the difference between the distance at the first peak point and the distance at the first point is less than or equal to the first threshold, and the absolute value of the difference between the AoA at the first peak point and the AoA at the first point is less than or equal to the second threshold, it is determined that the distance and/or the AoA at the first peak point can correctly reflect the position coordinates of the target object;
  • the first point is the peak point that most recently correctly reflected the position coordinates of the target object.
  • the first point can also be referred to as the last selected point (lastPoint).
  • the first threshold is 0.1 m.
  • the second threshold is 20 degrees.
  • first threshold and the second threshold can be flexibly set according to actual conditions.
  • some sudden abnormal noises, such as the sudden arrival of highly reflective objects, can be filtered out.
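The threshold filter, nearest-peak selection, and polar-to-Cartesian conversion described above can be sketched as follows. This is a minimal illustration and not the patented implementation: peaks are modeled as (range, AoA) pairs, the closeness metric used to pick the peak nearest lastPoint and the axis convention (x lateral, y along boresight) are our assumptions; the 0.1 m and 20-degree thresholds come from the text.

```python
import math

RANGE_THRESHOLD_M = 0.1    # first threshold from the text
AOA_THRESHOLD_DEG = 20.0   # second threshold from the text

def to_cartesian(range_m, aoa_deg):
    """Convert (range, AoA) to rectangular coordinates on the 2-D plane.

    Axis convention (our assumption): x lateral, y along boresight."""
    theta = math.radians(aoa_deg)
    return range_m * math.sin(theta), range_m * math.cos(theta)

def select_peak(peaks, last_point):
    """Pick the peak closest to the last accepted point (lastPoint)."""
    lx, ly = to_cartesian(*last_point)
    def dist(peak):
        px, py = to_cartesian(*peak)
        return math.hypot(px - lx, py - ly)
    return min(peaks, key=dist)

def accept_peak(peak, last_point):
    """Threshold filter: reject sudden jumps in range or AoA, which
    indicate abnormal noise such as a highly reflective object arriving."""
    in_range = abs(peak[0] - last_point[0]) <= RANGE_THRESHOLD_M
    in_angle = abs(peak[1] - last_point[1]) <= AOA_THRESHOLD_DEG
    return in_range and in_angle
```

During the first K frames no threshold filtering would be applied, so that a usable initial lastPoint can be found, as described above.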
  • step S240 may specifically be:
  • the peak point closest to the first point among the multiple peak points is determined as the first peak point, where the first point is the peak point that most recently correctly reflected the position coordinates of the target object.
  • since the first point is the peak point that most recently correctly reflected the position coordinates of the target object, and considering the moving speed of the target object and the time interval between frames, determining the peak point closest to the first point among the multiple peak points as the first peak point excludes noise peak points.
  • the first point used for initialization may be determined according to the mixed signals of the first K frames, where K is a positive integer. That is, the first point is initialized.
  • K is 5, 10, 15, or 20.
  • the initial position of the first point needs to be determined; that is, the value of lastPoint needs to be initialized.
  • the embodiment of the present application lets the first K frames pass through, that is, no threshold filtering is performed during the first K frames, so as to find a usable initial point; the data of the first K frames are then discarded. Since K frames last only a short moment, the user does not perceive any delay during initialization.
  • the AoA at the first peak point may be smoothed.
  • the AoA at the first peak point, the AoA at the first point, and the AoA at the second point are averaged to smooth the jitter of the AoA at the first peak point.
  • the first point is the peak point that most recently correctly reflected the position coordinates of the target object, and
  • the second point is the peak point that correctly reflected the position coordinates of the target object the time before last.
  • the obtained AoA value will have a certain amount of jitter.
  • the embodiment of the present application averages the last three AoA values, which smooths the jitter well and yields an excellent positioning track.
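The three-point averaging described above can be sketched as a small smoother. Whether the history holds the raw or the smoothed AoA values of the two previously accepted points is not specified in the text; this sketch assumes raw values.

```python
from collections import deque

class AoASmoother:
    """Average the AoA at the current peak with the AoA of the two
    previously accepted points (the first point and the second point)."""

    def __init__(self):
        self.history = deque(maxlen=2)  # AoA of the last two accepted points

    def smooth(self, aoa_deg):
        values = list(self.history) + [aoa_deg]
        smoothed = sum(values) / len(values)
        self.history.append(aoa_deg)
        return smoothed
```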
  • the distance and AoA from the target object to the at least two radar sensors at each of the multiple peak points may be calculated according to the phase difference between the at least two radar sensors; or, the distance and AoA from the target object to the at least two radar sensors at the first peak point may be calculated according to the phase difference between the at least two radar sensors.
  • the transmitted wave is a high-frequency continuous wave whose frequency changes with time according to the law of triangular waves.
  • the frequency of the echo received by the radar sensor varies according to the same triangular-wave law as the transmitted frequency, only with a time difference; using this small time difference, the distance from the target object to the transmitting end can be calculated.
  • the AoA from the target object to the transmitting end is estimated using at least two RX antennas, as shown in FIG. 8.
  • the path-length difference between the target object and the two RX antennas causes a phase change of the FFT peak, and the AoA is estimated from this phase change.
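The two relations above can be written down directly. The chirp slope, wavelength, and antenna spacing used below are hypothetical values; the formulas are the standard FMCW beat-frequency range equation and the standard two-antenna phase-difference AoA equation.

```python
import math

C = 3.0e8  # speed of light, m/s

def range_from_beat(f_beat_hz, slope_hz_per_s):
    """FMCW range: the beat frequency f_b = S * tau, where S is the chirp
    slope and tau = 2R/c is the round-trip delay, so R = c * f_b / (2 S)."""
    return C * f_beat_hz / (2.0 * slope_hz_per_s)

def aoa_from_phase(delta_phi_rad, wavelength_m, spacing_m):
    """Two-RX AoA: the path-length difference d*sin(theta) appears as a
    phase change delta_phi = 2*pi*d*sin(theta)/lambda of the FFT peak."""
    sin_theta = delta_phi_rad * wavelength_m / (2.0 * math.pi * spacing_m)
    return math.degrees(math.asin(sin_theta))
```

For example, with a chirp slope of 3e13 Hz/s a beat frequency of 1 MHz corresponds to 5 m, and with half-wavelength antenna spacing a phase difference of pi/2 corresponds to 30 degrees.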
  • step S230 may specifically include:
  • the peak values are used to determine the distance and signal strength of the object. Due to the extremely high resolution of millimeter waves and the interference of environmental objects, multiple detection points will form in the detection area, manifesting as multiple peak points.
  • however, not every peak point is a location where the target object exists; some peak points are noise.
  • step S220 may specifically include:
  • High-pass filtering and FFT processing are performed on the mixing signals collected by the at least two radar sensors to obtain the spectrum information.
  • the high-pass filtering may consist of subtracting the average value of the gesture information data from the data and using the high-pass-filtered data to eliminate all low frequencies.
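A minimal sketch of this mean-removal high-pass filter followed by an FFT and peak detection (the function names and the simple local-maximum test are our own):

```python
import numpy as np

def spectrum(frame):
    """Subtract the mean of the frame (a crude high-pass that removes the
    DC/low-frequency component), then take the FFT magnitude."""
    frame = np.asarray(frame, dtype=float)
    return np.abs(np.fft.rfft(frame - frame.mean()))

def find_peaks(mag):
    """Indices of local maxima of the magnitude spectrum: candidate peak
    points, some of which are the target and some of which are noise."""
    return [i for i in range(1, len(mag) - 1)
            if mag[i] > mag[i - 1] and mag[i] > mag[i + 1]]
```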
  • an embodiment of the present application provides a gesture recognition device 300, and the device 300 includes:
  • the obtaining unit 310 is configured to obtain gesture information collected by at least two radar sensors after the millimeter wave signal illuminates the user's hand and is reflected by the hand;
  • the processing unit 320 is configured to deconstruct the hand according to the gesture information to obtain multiple discrete surface energy points;
  • the processing unit 320 is further configured to recognize the hand gesture according to the movement trend of the multiple surface energy points.
  • the movement trends of the multiple surface energy points are reflected by at least one of the following frame sequences of M frames, where M is a positive integer:
  • Centripetal detection point number frame sequence, centripetal average distance frame sequence, centripetal average velocity frame sequence, centrifugal detection point number frame sequence, centrifugal average distance frame sequence, centrifugal average velocity frame sequence, energy centroid detection point number frame sequence, energy centroid average distance frame sequence, energy centroid average velocity frame sequence, angle value α frame sequence.
  • M = 20.
  • processing unit 320 is specifically configured to:
  • the M frame sequence reflecting the movement trend of the multiple surface energy points and the M frame constant calibration sequence are input into the neural network model to recognize the hand gesture.
  • the neural network model is an equal amount neural network model.
  • the neural network model includes at least two equal learning modules.
  • Each equal learning module in the at least two equal learning modules includes, from input to output, a first convolutional layer, a first normalization layer, and a linear rectification activation function layer.
  • the external and internal input and output size settings of each of the at least two equal learning modules are equal.
  • a convolutional layer with a 7×7 kernel, which learns 64-dimensional gesture information of size 14×7, is connected before the at least two equal learning modules, and/or, at least two fully connected layers are connected after the at least two equal learning modules.
  • the maximum pooling layer is connected before the at least two equal learning modules.
  • the processing unit 320 is further configured to input the multiple surface energy points into a pseudo-recreation model for the hand movement to obtain the movement trend of the multiple surface energy points.
  • processing unit 320 is specifically configured to:
  • according to the second surface energy point set, determine the centrifugal detection point number frame sequence, the centrifugal average distance frame sequence, and the centrifugal average velocity frame sequence;
  • according to the multiple surface energy points, determine the energy centroid detection point number frame sequence, the energy centroid average distance frame sequence, and the energy centroid average velocity frame sequence;
  • according to the AoA of each of the multiple surface energy points, determine the angle value α frame sequence.
  • processing unit 320 is also used for
  • processing unit 320 is specifically configured to:
  • according to the phase difference between the at least two radar sensors, calculate, at each surface energy point in the plurality of surface energy points, the distance from the hand to the transmitting end, the AoA from the hand to the transmitting end, and the speed of the hand relative to the transmitting end.
  • processing unit 320 is specifically configured to:
  • the hand is deconstructed to obtain the discrete surface energy points.
  • the energy values of the multiple surface energy points are greater than the first threshold value.
  • processing unit 320 is further configured to:
  • a non-target gesture library, which includes large body or torso movements, small fingertip movements, and other hand movements with trajectories;
  • the target gesture includes a single-click gesture and/or a double-click gesture.
  • the first rule is:
  • Step 1: the probability of recognizing the hand gesture as the target gesture is greater than the first threshold;
  • Step 2: the probability of recognizing the hand gesture as a non-target gesture is less than the second threshold;
  • Step 3: a gesture classification result that satisfies both Step 1 and Step 2 is a valid recognition result.
  • the first threshold is 90%.
  • the second threshold is 15%.
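The first rule amounts to a two-threshold acceptance test; with the 90% and 15% values given above it can be sketched as:

```python
P_TARGET_MIN = 0.90     # first threshold from the text
P_NONTARGET_MAX = 0.15  # second threshold from the text

def is_valid_recognition(p_target, p_nontarget):
    """Valid only when the target-gesture probability is high enough AND
    the non-target probability is low enough (Steps 1 and 2 both hold)."""
    return p_target > P_TARGET_MIN and p_nontarget < P_NONTARGET_MAX
```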
  • the gesture recognition device 300 may correspond to the method embodiments of the present application, and the above and other operations and/or functions of the units in the gesture recognition device 300 implement the corresponding processes of the method 100 shown in FIG. 1, which are not repeated here.
  • an embodiment of the present application provides a location tracking device 400, and the device 400 includes:
  • the obtaining unit 410 is configured to obtain a frame of mixed signals collected by at least two radar sensors after the millimeter wave signal illuminates the target object and is reflected by the target object;
  • the processing unit 420 is configured to determine frequency spectrum information according to the mixing signals collected by the at least two radar sensors;
  • the processing unit 420 is also used to detect the spectrum information to obtain multiple peak points;
  • the processing unit 420 is further configured to perform denoising processing on the multiple peak points to determine a first peak point among the multiple peak points;
  • the processing unit 420 is further configured to calculate the position coordinates of the target object in a rectangular coordinate system on a two-dimensional plane according to the distance and AoA from the target object to the at least two radar sensors at the first peak point.
  • before the processing unit 420 calculates the position coordinates of the target object in the rectangular coordinate system on the two-dimensional plane according to the distance and the AoA at the first peak point, the processing unit 420 is further configured to:
  • judge whether the distance and/or the AoA at the first peak point can correctly reflect the position coordinates of the target object; and
  • if so, use the distance and the AoA at the first peak point to calculate the position of the target object on the two-dimensional plane.
  • processing unit 420 is specifically configured to:
  • if the absolute value of the difference between the distance at the first peak point and the distance at the first point is greater than the first threshold, or the absolute value of the difference between the AoA at the first peak point and the AoA at the first point is greater than the second threshold, it is determined that the distance and/or the AoA at the first peak point cannot correctly reflect the position coordinates of the target object; and/or,
  • if the absolute value of the difference between the distance at the first peak point and the distance at the first point is less than or equal to the first threshold, or the absolute value of the difference between the AoA at the first peak point and the AoA at the first point is less than or equal to the second threshold, it is determined that the distance and/or the AoA at the first peak point can correctly reflect the position coordinates of the target object;
  • the first point is the peak point that most recently correctly reflected the position coordinates of the target object.
  • the first threshold is 0.1 m.
  • the second threshold is 20 degrees.
  • processing unit 420 is specifically configured to:
  • the peak point closest to the first point among the multiple peak points is determined as the first peak point, where the first point is the peak point that most recently correctly reflected the position coordinates of the target object.
  • the processing unit 420 is further configured to determine the first point used for initialization according to the mixed signals of the first K frames, where K is a positive integer.
  • the processing unit 420 is further configured to perform smoothing processing on the AoA at the first peak point.
  • processing unit 420 is further configured to:
  • the AoA at the first peak point, the AoA at the first point, and the AoA at the second point are averaged to smooth the jitter of the AoA at the first peak point, where the first point is the peak point that most recently correctly reflected the position coordinates of the target object, and the second point is the peak point that correctly reflected the position coordinates of the target object the time before last.
  • processing unit 420 is further configured to:
  • according to the phase difference between the at least two radar sensors, calculate the distance and AoA from the target object at the first peak point to the at least two radar sensors.
  • processing unit 420 is specifically configured to:
  • processing unit 420 is specifically configured to:
  • High-pass filtering and FFT processing are performed on the mixing signals collected by the at least two radar sensors to obtain the spectrum information.
  • the location tracking device 400 may correspond to the method embodiments of the present application, and the above and other operations and/or functions of the units in the location tracking device 400 implement the corresponding processes of the method 200 shown in FIG. 7, which are not repeated here.
  • an embodiment of the present application provides a gesture recognition device 500, and the gesture recognition device 500 includes:
  • the memory 510 is used to store programs and data
  • the processor 520 is configured to call and run the programs and data stored in the memory;
  • the apparatus 500 is configured to perform the methods shown in FIGS. 1 to 6 described above.
  • an embodiment of the present application provides a location tracking device 600, and the location tracking device 600 includes:
  • the memory 610 is used to store programs and data
  • the processor 620 is configured to call and run the programs and data stored in the memory;
  • the apparatus 600 is configured to perform the methods shown in FIGS. 7 to 8 described above.
  • an embodiment of the present application provides a gesture recognition system 700, including:
  • the transmitting end device 710 is used to transmit millimeter wave signals
  • At least two radar sensors 720 configured to collect gesture information of millimeter wave signals that illuminate the user's hand and are reflected by the hand;
  • the device 730 includes a memory 731 for storing programs and data, and a processor 732 for calling and running the programs and data stored in the memory.
  • the device 730 is configured to execute the methods shown in FIGS. 1 to 6 described above.
  • an embodiment of the present application provides a location tracking system 800, including:
  • the transmitting end device 810 is used to transmit millimeter wave signals
  • At least two radar sensors 820 configured to collect a frame of mixed signals after the millimeter wave signal illuminates the target object and is reflected by the target object;
  • the device 830 includes a memory 831 for storing programs and data, and a processor 832 for calling and running the programs and data stored in the memory.
  • the device 830 is configured to execute the methods shown in FIGS. 7 to 8 described above.
  • the processor of the embodiment of the present application may be an integrated circuit chip with signal processing capability.
  • the steps of the foregoing method embodiments can be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • the aforementioned processor may be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the methods, steps, and logical block diagrams disclosed in the embodiments of the present application can be implemented or executed.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a storage medium mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory in the embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
  • the volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache.
  • the memory in the embodiment of the present application may also be static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), direct rambus random access memory (Direct Rambus RAM, DR RAM), and so on. That is to say, the memory in the embodiments of the present application is intended to include, but is not limited to, these and any other suitable types of memory.
  • the embodiment of the present application also provides a computer-readable storage medium for storing computer programs.
  • the computer-readable storage medium can be applied to the gesture recognition device in the embodiments of the present application, and the computer program causes the computer to execute the corresponding processes implemented by the gesture recognition device in each method of the embodiments of the present application; for brevity, details are not repeated here.
  • the computer-readable storage medium can be applied to the location tracking device in the embodiments of the present application, and the computer program causes the computer to execute the corresponding processes implemented by the location tracking device in each method of the embodiments of the present application; for brevity, details are not repeated here.
  • the embodiments of the present application also provide a computer program product, including computer program instructions.
  • the computer program product can be applied to the gesture recognition device in the embodiments of the present application, and the computer program instructions cause the computer to execute the corresponding processes implemented by the gesture recognition device in each method of the embodiments of the present application; for brevity, details are not repeated here.
  • the computer program product can be applied to the location tracking device in the embodiments of the present application, and the computer program instructions cause the computer to execute the corresponding processes implemented by the location tracking device in each method of the embodiments of the present application; for brevity, details are not repeated here.
  • the embodiment of the present application also provides a computer program.
  • the computer program can be applied to the gesture recognition device in the embodiment of the present application.
  • when the computer program runs on the computer, the computer executes the corresponding processes implemented by the gesture recognition device in the various methods of the embodiments of the present application; for brevity, details are not repeated here.
  • the computer program can be applied to the location tracking device in the embodiment of the present application.
  • when the computer program runs on the computer, the computer executes the corresponding processes implemented by the location tracking device in the various methods of the embodiments of the present application; for brevity, details are not repeated here.
  • the size of the sequence numbers of the above processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation processes of the embodiments of the present application.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of units is only a logical function division; in actual implementation, there may be other division methods, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
  • the technical solution of this application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage media include: USB flash disk, removable hard disk, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), magnetic disk, optical disc, and other media that can store program code.

Abstract

A method and device for gesture recognition, the method including: obtaining gesture information collected by at least two radar sensors after a millimeter wave signal illuminates a user's hand and is reflected by the hand (S110); deconstructing the hand according to the gesture information to obtain multiple discrete surface energy points (S120); and recognizing the gesture of the hand according to the movement trend of the multiple surface energy points (S130). The method can perform gesture recognition based on a hand-deconstruction method and a neural network customized for it, and can be widely applied to recognizing a large number of different gestures. Also provided are a method and device for location tracking, which can avoid or correct environmental interference, obtain accurate, interference-free positioning coordinates, and smoothly complete real-time location tracking.

Description

Method and device for gesture recognition, and method and device for location tracking

Technical Field
The embodiments of the present application relate to the field of human-computer interaction, and more specifically, to a method and device for gesture recognition and a method and device for location tracking.
Background
As a next-generation wireless communication technology, millimeter wave can greatly increase wireless network speed. At the same time, millimeter wave can be applied to distance sensing and measurement. However, millimeter-wave-based distance sensing and measurement of small objects performs poorly; how to improve millimeter-wave-based distance sensing and measurement of small objects is an urgent problem to be solved.
Summary
The embodiments of the present application provide a method and device for gesture recognition, which can perform gesture recognition based on a hand-deconstruction method and a neural network customized for it, and can be widely applied to recognizing a large number of different gestures. The embodiments of the present application also provide a method and device for location tracking, which can avoid or correct environmental interference, obtain accurate, interference-free positioning coordinates, and smoothly complete real-time location tracking.
In a first aspect, a method for gesture recognition is provided, the method including:
obtaining gesture information collected by at least two radar sensors after a millimeter wave signal illuminates a user's hand and is reflected by the hand;
deconstructing the hand according to the gesture information to obtain multiple discrete surface energy points; and
recognizing the gesture of the hand according to the movement trend of the multiple surface energy points.
In a second aspect, a method for location tracking is provided, the method including:
obtaining one frame of mixed signals collected by at least two radar sensors after a millimeter wave signal illuminates a target object and is reflected by the target object;
determining spectrum information according to the mixed signals collected by the at least two radar sensors;
detecting the spectrum information to obtain multiple peak points;
performing denoising processing on the multiple peak points to determine a first peak point among the multiple peak points; and
calculating the position coordinates of the target object in a rectangular coordinate system on a two-dimensional plane according to the distance and AoA from the target object to the at least two radar sensors at the first peak point.
In a third aspect, a gesture recognition device is provided, including:
an obtaining unit, configured to obtain gesture information collected by at least two radar sensors after a millimeter wave signal illuminates a user's hand and is reflected by the hand; and
a processing unit, configured to deconstruct the hand according to the gesture information to obtain multiple discrete surface energy points;
the processing unit being further configured to recognize the gesture of the hand according to the movement trend of the multiple surface energy points.
In a fourth aspect, a location tracking device is provided, including:
an obtaining unit, configured to obtain one frame of mixed signals collected by at least two radar sensors after a millimeter wave signal illuminates a target object and is reflected by the target object; and
a processing unit, configured to determine spectrum information according to the mixed signals collected by the at least two radar sensors;
the processing unit being further configured to detect the spectrum information to obtain multiple peak points;
the processing unit being further configured to perform denoising processing on the multiple peak points to determine a first peak point among the multiple peak points;
the processing unit being further configured to calculate the position coordinates of the target object in a rectangular coordinate system on a two-dimensional plane according to the distance and AoA from the target object to the at least two radar sensors at the first peak point.
In a fifth aspect, a gesture recognition apparatus is provided, including:
a memory for storing programs and data; and
a processor for calling and running the programs and data stored in the memory;
the apparatus being configured to execute the method in the above first aspect or any possible implementation thereof.
In a sixth aspect, a location tracking apparatus is provided, including:
a memory for storing programs and data; and
a processor for calling and running the programs and data stored in the memory;
the apparatus being configured to execute the method in the above second aspect or any possible implementation thereof.
In a seventh aspect, a gesture recognition system is provided, including:
a transmitting end device for transmitting millimeter wave signals;
at least two radar sensors for collecting gesture information of a millimeter wave signal that illuminates a user's hand and is reflected by the hand; and
an apparatus including a memory for storing programs and data and a processor for calling and running the programs and data stored in the memory, the apparatus being configured to execute the method in the above first aspect or any possible implementation thereof.
In an eighth aspect, a location tracking system is provided, including:
a transmitting end device for transmitting millimeter wave signals;
at least two radar sensors for collecting one frame of mixed signals after a millimeter wave signal illuminates a target object and is reflected by the target object; and
an apparatus including a memory for storing programs and data and a processor for calling and running the programs and data stored in the memory, the apparatus being configured to execute the method in the above second aspect or any possible implementation thereof.
In a ninth aspect, a computer-readable storage medium is provided for storing a computer program, the computer program causing a computer to execute the method in any one of the above first to second aspects or their implementations.
In a tenth aspect, a computer program product is provided, including computer program instructions, the computer program instructions causing a computer to execute the method in any one of the above first to second aspects or their implementations.
In an eleventh aspect, a computer program is provided which, when run on a computer, causes the computer to execute the method in any one of the above first to second aspects or their implementations.
Through the above technical solution for gesture recognition, gesture recognition can be performed based on the hand-deconstruction method and the neural network customized for it, and the solution can be widely applied to recognizing a large number of different gestures.
Through the above technical solution for location tracking, environmental interference can be avoided or corrected, accurate and interference-free positioning coordinates can be obtained, and real-time location tracking can be completed smoothly.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a gesture recognition method provided by an embodiment of the present application.
FIG. 2 is a schematic diagram of millimeter wave signal transmission provided by an embodiment of the present application.
FIG. 3 is a schematic diagram of a transmitted wave and a received echo provided by an embodiment of the present application.
FIG. 4 is a schematic diagram of single and double clicks provided by an embodiment of the present application.
FIG. 5 is a schematic diagram of a neural network model provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of gesture recognition provided by an embodiment of the present application.
FIG. 7 is a schematic flowchart of a location tracking method provided according to an embodiment of the present application.
FIG. 8 is another schematic diagram of millimeter wave signal transmission provided by an embodiment of the present application.
FIG. 9 is a schematic structural diagram of a gesture recognition device according to an embodiment of the present application.
FIG. 10 is a schematic structural diagram of a location tracking device according to an embodiment of the present application.
FIG. 11 is a schematic structural diagram of a gesture recognition apparatus according to an embodiment of the present application.
FIG. 12 is a schematic structural diagram of a location tracking apparatus according to an embodiment of the present application.
FIG. 13 is a schematic structural diagram of a gesture recognition system according to an embodiment of the present application.
FIG. 14 is a schematic structural diagram of a location tracking system according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
As a next-generation wireless communication technology, millimeter wave can greatly increase wireless network speed. For example, Institute of Electrical and Electronics Engineers (IEEE) 802.11ad, operating in the 60 GHz band, supports data rates of up to 6.7 Gbps, and its evolved standard IEEE 802.11ay will provide data rates of 20 Gbps. Therefore, millimeter wave radio is expected to bring wireless network access into the multi-Gbps era. In the foreseeable future, millimeter wave radio modules will be widely installed in mobile phones, wearables, smart hardware, and a wider range of Internet of Things devices, becoming a mainstream communication technology.
At the same time, millimeter wave sensing has its own unique advantages and can provide smarter, more convenient, and more interesting product experiences. Millimeter wave sensing can recognize actions without a screen, has a wider recognition range, is almost unaffected by light and heat radiation sources, and can measure real distances. It can perform well in distance sensing, gesture detection, proximity detection, people counting, distance measurement, presence detection, and so on.
At present, detection of target objects equipped with Radio Frequency Identification (RFID) tags can already achieve centimeter-level positioning, using interferometry as the technical means. It measures the relative phases of multiple RFID receivers to generate a phase hologram; the potential positions of the object can be mapped onto the phase, and location tracking is then achieved through phase changes. Since this solution requires an RFID tag to be attached to the detected object, its application scenarios are cumbersome and it cannot meet the requirements of daily use.
Based on the above technical problems in location tracking, the present application proposes a location tracking solution. Thanks to the accuracy of millimeter wave distance sensing, the angle of arrival (AoA) and range from the target object to the chip can be obtained accurately; after background-noise processing and sudden-abnormal-noise-point processing, the position of the target object on a two-dimensional plane can be calculated, thereby achieving location tracking.
There currently exists a new sensing technology that uses miniature radar to monitor mid-air hand gestures. Using millimeter wave technology, it feeds raw Range-Doppler continuous heat maps directly into a neural network and can track high-speed motion with sub-millimeter precision to achieve gesture recognition. However, this technology does not yet have the ability to accurately recognize gestures from strangers, and it also faces many noise-processing challenges, causing the tracked object trajectory to jitter violently.
It should be noted that the Doppler effect means that the wavelength radiated by an object changes due to the relative motion between the source and the observer. In front of a moving wave source, the waves are compressed, the wavelength becomes shorter, and the frequency becomes higher; behind a moving wave source, the opposite effect occurs: the wavelength becomes longer and the frequency becomes lower. The higher the speed of the wave source, the greater the effect; according to the degree of red/blue shift of the light wave, the speed of the wave source along the observation direction can be calculated. That is, the above Range-Doppler continuous heat map can also be called a range-velocity continuous heat map.
It should be noted that, regarding gesture recognition, we found that whenever people make the same gesture, the movement trend of the hand has its own commonality, which makes it possible for millimeter wave recognition technology to recognize the same action from different users, especially gestures of users the device has never seen. The technical principle is to analyze the differences between millimeter wave signals bounced back from the hand to find the characteristics of single-click and double-click actions, and then use the neural network we designed for this purpose to extract and learn the detailed commonalities of single-click and double-click gestures, so as to recognize the single-click and double-click actions of unfamiliar users.
Based on the above technical problems in gesture recognition, the present application proposes a gesture recognition solution that avoids feeding the raw Range-Doppler continuous heat map directly into the neural network. Instead, after the Range-Doppler continuous heat map is obtained, the hand is deconstructed to obtain multiple discrete surface energy points, and the gesture of the hand is recognized according to the movement trend of the multiple surface energy points, so that gesture recognition of unfamiliar users can be achieved.
The gesture recognition solution proposed in this application can provide a simpler, more universal user experience that is easy for users to accept. Its biggest advantage is that it can achieve a high recognition accuracy without requiring the user's gesture information as input, giving it very high universality and application value. Combined with the location tracking technology, it can implement a mouse-simulation function, further becoming a technology that can be commercialized and popularized.
When first encountering the gesture recognition solution proposed in this application, a user can directly control the device after watching the action demonstration only once, without having to perform the same gesture many times as training data fed into the neural network for learning. In terms of positioning, the user can set the ratio at which the tracked target point is mapped onto the screen (for example, an actual movement of 5 cm corresponds to 1 cm on the screen), so the solution can be flexibly adapted to different devices such as mobile phones, tablets, and laptops. Together with the single/double-click detection function, it can replace a physical mouse in some scenarios, i.e., simulate a mouse.
FIG. 1 is a schematic flowchart of a gesture recognition method 100 according to an embodiment of the present application. It should be understood that FIG. 1 shows the steps or operations of the method 100, but these steps or operations are only examples; the embodiments of the present application may also perform other operations or variations of the operations in FIG. 1. The method 100 may be executed by a gesture recognition apparatus, where the apparatus may be a mobile phone, a tablet, a portable computer, a personal digital assistant (PDA), etc., or may be a module or system in a mobile phone, a tablet, a portable computer, a PDA, and so on.
Specifically, the gesture recognition method 100 includes:
S110, obtaining gesture information collected by at least two radar sensors after a millimeter wave signal illuminates a user's hand and is reflected by the hand;
S120, deconstructing the hand according to the gesture information to obtain multiple discrete surface energy points;
S130, recognizing the gesture of the hand according to the movement trend of the multiple surface energy points.
It should be noted that, in the embodiments of the present application, strangers can also achieve contactless gesture recognition based on the gesture recognition method 100. That is, when a new user makes gestures in front of the gesture recognition apparatus to control the device, the gestures can be accurately recognized without pre-recording the user's own gestures. The gesture recognition method 100 can be applied to tasks that do not require gesture-control permissions, such as taking photos, controlling a music player, editing photos, and switching radio stations.
Optionally, in the above step S130, single-click and double-click gestures can be recognized, and a large number of other gestures can also be recognized.
Optionally, as shown in FIG. 2, the millimeter wave signal may be transmitted by the transmitting end device (TX antenna); after the millimeter wave signal illuminates the user's hand and is reflected by the hand, it is collected by the at least two radar sensors (RX antennas), and the signals collected by the at least two radar sensors are the gesture information.
Optionally, in the embodiments of the present application, the movement trend of the multiple surface energy points (Surface Energy Points, SEPs) is reflected by at least one of the following frame sequences of M frames, where M is a positive integer:
centripetal detection point number frame sequence, centripetal average distance frame sequence, centripetal average velocity frame sequence, centrifugal detection point number frame sequence, centrifugal average distance frame sequence, centrifugal average velocity frame sequence, energy centroid detection point number frame sequence, energy centroid average distance frame sequence, energy centroid average velocity frame sequence, and angle value α frame sequence.
For example, M = 20.
It should be noted that the movement trend of the multiple surface energy points may also be reflected by other information, which is not limited in this application.
Optionally, to avoid the influence of tiny environmental disturbances on gesture recognition, the energy values of the multiple surface energy points obtained in the above step S120 are greater than a first threshold value. That is, after the hand is deconstructed, the surface energy points need to be screened to select the multiple surface energy points whose energy is greater than the first threshold value.
The hand is not a rigid body but a flexible body whose surface skin can bend and deform; therefore, in different gestures the hand simultaneously has parts moving forward and parts moving backward. Accordingly, we model the hand by its movement trend, so as to depict and record the concrete information of different gestures.
Optionally, the movement trend of the multiple surface energy points can be obtained by inputting the multiple surface energy points into a Pseudo Representative Model (PRM) for the hand movement.
The movement trend of the multiple surface energy points obtained through PRM processing is low-dimensional and concise, which facilitates the design of the neural network generalization model in the next step.
Specifically, the movement trend of the multiple surface energy points can be obtained as follows:
classify the multiple surface energy points into two movement directions relative to the transmitting end of the millimeter wave signal, centripetal (CP) and centrifugal (CF), to obtain a first surface energy point set (CP) and a second surface energy point set (CF);
according to the first surface energy point set, determine the centripetal detection point number (Amount) frame sequence, centripetal average distance (Range) frame sequence, and centripetal average velocity (Velocity) frame sequence;
according to the second surface energy point set, determine the centrifugal detection point number frame sequence, centrifugal average distance frame sequence, and centrifugal average velocity frame sequence;
according to the multiple surface energy points, determine the energy centroid detection point number frame sequence, energy centroid average distance frame sequence, and energy centroid average velocity frame sequence;
according to the AoA of each of the multiple surface energy points, determine the angle value α frame sequence.
For example, the first surface energy point set includes SEP 1, SEP 2, SEP 3, SEP 4, and SEP 5, where SEP i has distance Range i and velocity Velocity i (i = 1, …, 5). Then the centripetal detection point number for the frame is 5, the centripetal average distance is (Range 1 + Range 2 + Range 3 + Range 4 + Range 5)/5, and the centripetal average velocity is (Velocity 1 + Velocity 2 + Velocity 3 + Velocity 4 + Velocity 5)/5.
For another example, the second surface energy point set includes SEP 6, SEP 7, and SEP 8, where SEP i has distance Range i and velocity Velocity i (i = 6, 7, 8). Then the centrifugal detection point number for the frame is 3, the centrifugal average distance is (Range 6 + Range 7 + Range 8)/3, and the centrifugal average velocity is (Velocity 6 + Velocity 7 + Velocity 8)/3.
For yet another example, the multiple surface energy points include SEP 1 through SEP 8, where SEP i has distance Range i, velocity Velocity i, and angle AoA i (i = 1, …, 8). Then the energy centroid detection point number for the frame is 8, the energy centroid average distance is (Range 1 + … + Range 8)/8, the energy centroid average velocity is (Velocity 1 + … + Velocity 8)/8, and the angle value α is (AoA 1 + … + AoA 8)/8.
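The per-frame statistics worked through above can be sketched as one function. The sign convention used to separate centripetal (CP) from centrifugal (CF) motion is our assumption (negative radial velocity means moving toward the transmitter):

```python
def prm_frame_features(seps):
    """seps: list of (range_m, velocity_mps, aoa_deg) surface energy points
    for one frame. Returns the CP/CF/EC (count, mean range, mean velocity)
    triples and the angle value alpha (mean AoA) for this frame."""
    cp = [s for s in seps if s[1] < 0]   # moving toward the transmitter
    cf = [s for s in seps if s[1] >= 0]  # moving away from the transmitter

    def stats(group):
        if not group:
            return (0, 0.0, 0.0)
        n = len(group)
        return (n,
                sum(s[0] for s in group) / n,
                sum(s[1] for s in group) / n)

    alpha = sum(s[2] for s in seps) / len(seps) if seps else 0.0
    return {"CP": stats(cp), "CF": stats(cf), "EC": stats(seps),
            "alpha": alpha}
```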
可选地,在本申请实施例中,可以根据该至少两个雷达传感器之间的相位差,计算该多个表面能量点中每个表面能量点处的该手部到该发射端的距离、该手部到该发射端的AoA、该手部相对于该发射端的速度。
具体地,如图3所示,发射波为高频连续波,其频率随时间按照三角波规律变化。雷达传感器接收的回波的频率与发射的频率变化规律相同,都是三角波规律,只是有一个时间差,利用这个微小的时间差即可计算出目标物体到发射端的距离。
该手部到该发射端的AoA估算使用至少两个RX天线,如图2所示。手部与两个RX天线的距离差会导致FFT峰值的相位变化,通过相位变化来进行AoA估算。
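上述测距与测角原理可以概括为如下示意性代码(仅为示例:其中 d = c·Δt/2 为由收发时间差测距的一般关系式,AoA 公式 θ = arcsin(λ·Δφ/(2π·d)) 假设两RX天线间距为d且为远场入射,均为常见假设,并非本申请的限定):

```python
import math

C = 3e8  # 光速(m/s)

def range_from_delay(delta_t_s):
    """由发射波与回波之间的微小时间差估算目标到发射端的距离:
    电磁波走过往返路程,故 d = c·Δt/2。"""
    return C * delta_t_s / 2.0

def aoa_from_phase(delta_phi_rad, wavelength_m, antenna_spacing_m):
    """由两个RX天线上FFT峰值的相位差估算到达角(AoA,单位:度):
    θ = arcsin(λ·Δφ / (2π·d)),d 为两RX天线间距。"""
    s = wavelength_m * delta_phi_rad / (2.0 * math.pi * antenna_spacing_m)
    s = max(-1.0, min(1.0, s))  # 数值裁剪,避免浮点误差越过arcsin定义域
    return math.degrees(math.asin(s))
```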
在本申请实施例中,通过计算每一帧中向心(CP)运动的表面能量点的向心检测点数帧序列、向 心平均距离帧序列、向心平均速度帧序列之间的差值,计算每一帧中离心(CF)运动的表面能量点的离心检测点数帧序列、离心平均距离帧序列、离心平均速度帧序列之间的差值,以及计算每一帧中能量质心(energy-centroid,EC)表面能量点的能量质心检测点数帧序列、能量质心平均距离帧序列、能量质心平均速度帧序列之间的差值,最后将它们按照时间的顺序串联在一起,即可得到手部形状随手势变化而变化的独特性差异。
以双击动作为例(如图4左所示),上述各帧序列随时间的变化绘制在图4(右)中。最开始敲下去的过程中,检测到的向心(CP)运动的SEPs多于离心(CF)运动的SEPs;随后在预备第二次敲击的回程手势中,离心(CF)运动所检测到的SEPs多于向心(CP)运动的SEPs;之后随着第二次敲击,重复相同的变化。由此可见,通过PRM建模描绘的手部构型变化与实际相符。
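上述"向心/离心点数此消彼长"的交替模式,可以用如下示意性代码构造并观察(帧数据为人为构造的示例,并非实测数据):

```python
# 示意:按时间顺序串联各帧向心与离心检测点数之差,
# 双击动作会呈现"正-负-正-负"的交替签名。
def cp_cf_diff_series(frames):
    """frames: [(向心检测点数, 离心检测点数), ...],返回逐帧差值序列。"""
    return [cp - cf for cp, cf in frames]

# 构造的双击示例:敲下(CP占优)->回程(CF占优)->再敲下->再回程
double_click = [(5, 1), (4, 2), (1, 6), (2, 5), (6, 1), (5, 2), (1, 5)]
signature = cp_cf_diff_series(double_click)
```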
在本申请实施例中,经由PRM解构的手势信息具有低维度与简洁的特性,为后一步的神经网络泛化模型的设计提供了便利。
可选地,在本申请实施例中,上述步骤S130具体可以是:将M帧反映该多个表面能量点的运动趋势的帧序列和M帧常数标定序列输入神经网络模型,识别该手部的手势。
例如,该神经网络模型的输入是10条时间帧序列和1条常数标定序列,10条时间帧序列分别为:向心检测点数帧序列、向心平均距离帧序列、向心平均速度帧序列、离心检测点数帧序列、离心平均距离帧序列、离心平均速度帧序列、能量质心检测点数帧序列、能量质心平均距离帧序列、能量质心平均速度帧序列、角度值α帧序列。每一条序列采用20帧的长度,因此每一条手势的信息为20×11的一维矩阵。
例如,将1-20帧的一条序列输入该神经网络模型,该神经网络模型输出手势1;将2-21帧的一条序列输入该神经网络模型,该神经网络模型输出手势2;将3-22帧的一条序列输入该神经网络模型,该神经网络模型输出手势3,以此类推,在此不再赘述。
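上述"10条帧序列加1条常数标定序列拼成20×11输入、并按滑动窗口逐次推断"的流程,可以用如下示意性代码表示(函数名与常数标定值均为示例性假设):

```python
def gesture_input(sequences, m=20, calib_value=1.0):
    """将10条长度为m的帧序列与1条常数标定序列拼成 m×11 的输入矩阵。
    sequences: 10个长度为m的列表;返回按帧排列的 m 行、11 列(列表的列表)。"""
    assert len(sequences) == 10 and all(len(s) == m for s in sequences)
    cols = sequences + [[calib_value] * m]  # 第11列:常数标定序列(取值为示例)
    return [[col[i] for col in cols] for i in range(m)]

def sliding_windows(seq, win=20, step=1):
    """按滑动窗口切分帧流:第1-20帧、第2-21帧……每个窗口作一次手势识别的输入。"""
    return [seq[i:i + win] for i in range(0, len(seq) - win + 1, step)]
```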
可选地,该神经网络模型为等额神经网络模型。例如,该神经网络模型为与上述PRM适配的等额神经网络模型。
可选地,如图5所示,该神经网络模型1000包括至少两个等额学习(Equal-Learning,EL)模块1010,该至少两个等额学习模块1010中每个等额学习模块从输入至输出依次包括第一卷积层1111、第一标准化(Batch Normalization)层1112、线性整流(Rectified Linear Unit,ReLU)激活函数层1113、第二卷积层1114、第二标准化层1115。
例如,为了能够不丢失任何信息地学习到特征,该至少两个等额学习模块中每个等额学习模块外部与内部的输入与输出尺寸设置均相等。
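等额学习模块"输入输出尺寸相等"这一约束,可以通过same填充的卷积输出尺寸公式加以验证,示意如下(卷积核大小与填充方式为示例性假设,并非本申请的限定):

```python
def conv_out(n, k, stride=1, pad=0):
    """标准卷积输出尺寸公式:floor((n + 2·pad - k) / stride) + 1。"""
    return (n + 2 * pad - k) // stride + 1

def el_module_shapes(h, w, k=3):
    """模拟一个等额学习模块内两个卷积层的尺寸传播:
    取 pad = k // 2(same填充,k为奇数)使每层输入与输出尺寸相等。"""
    pad = k // 2
    for _ in range(2):  # 模块内依次经过第一卷积层、第二卷积层
        h, w = conv_out(h, k, 1, pad), conv_out(w, k, 1, pad)
    return h, w
```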
可选地,如图5所示,在该神经网络模型1000中,核为7×7的卷积层1020学习64维规格为14×7的手势信息接在该至少两个等额学习模块1010前,和/或,至少两层全连接层1030接在该至少两个等额学习模块1010后。
需要说明的是,核为7×7的卷积层学习64维规格为14×7的手势信息接在该至少两个等额学习模块前,可以确保该至少两个等额学习模块在训练过程中拥有足够的参数可被调整,增强其学习能力。
还需要说明的是,该至少两层全连接层(Full-Connected,FC)接在该至少两个等额学习模块后,以便该神经网络模型最后的特征提纯和分类。
可选地,如图5所示,在该神经网络模型1000中,最大值池化层1040接在该至少两个等额学习模块1010前。从而,可以将一些重要值移至图片中央,以增加该神经网络模型的学习能力。
可选地,如图5所示,该神经网络模型1000的输入为20×11的一维矩阵1050,该一维矩阵1050为一条手势的信息。
可选地,在本申请实施例中,上述步骤S120具体可以是:
对该至少两个雷达传感器所采集的该手势信息进行高通滤波和至少两次的快速傅氏变换(Fast Fourier Transformation,FFT)处理,得到频谱信息;根据该频谱信息,对该手部进行解构,得到离散的该多个表面能量点。
需要说明的是,高通滤波可以是:从该手势信息的数据中减去其平均值,得到高通滤波后的数据,从而消除所有低频分量。
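上述"减均值"式高通滤波可以用如下示意性代码表示(仅为示例):

```python
def high_pass(samples):
    """从数据中减去其平均值,消除直流及低频分量(文中所述的高通滤波)。"""
    m = sum(samples) / len(samples)
    return [s - m for s in samples]
```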
因此,本申请实施例中,在运用PRM解构手势信息的基础上,以及基于与PRM适配的神经网络模型,不但具有优异的手势分类能力,而且所消耗的资源和时间都很小,为将此神经网络模型部署在商用手机或者类似的设备上提供了极大的可能性。
可选地,作为一个示例如图6所示,在获取到手势信息之后,依次执行步骤a、步骤b、步骤c、步骤d,即可得到所述手部的手势识别结果。步骤a,根据所述手势信息,对所述手部进行解构,得到离散的多个表面能量点;步骤b,将该多个表面能量点输入针对该手部运动的PRM,得到该多个表 面能量点的运动趋势;步骤c,将M帧反映所述多个表面能量点的运动趋势的帧序列和M帧常数标定序列输入神经网络模型;步骤d,该神经网络模型输出所述手部的手势识别结果。
可选地,在本申请实施例中,为了排除用户的非手势动作或非目标手势对于识别能力的影响,还可以进行如下操作:
建立非目标手势库,该非目标手势库包括大型的肢体或躯干动作、小型的指尖动作、其他运动轨迹的手部动作;
根据该非目标手势库和第一规则确定该手部的手势是否为目标手势。
可选地,该目标手势包括单击手势和/或双击手势。
可选地,该第一规则为:
步骤一,识别该手部的手势为目标手势的概率大于第一阈值;
步骤二,识别该手部的手势为非目标手势的概率小于第二阈值;
步骤三,同时满足步骤一和步骤二的手势分类结果为有效的识别结果。
可选地,该第一阈值为90%。可选地,该第二阈值为15%。
例如,该神经网络模型输出识别该手部的手势为目标手势的概率为95%,以及识别该手部的手势为非目标手势的概率为5%,则基于该非目标手势库和该第一规则可以确定该神经网络模型输出的手势识别结果为有效的识别结果,即该手势为目标手势。
又例如,该神经网络模型输出识别该手部的手势为目标手势的概率为75%,以及识别该手部的手势为非目标手势的概率为25%,则基于该非目标手势库和该第一规则可以确定该神经网络模型输出的手势识别结果为无效的识别结果,舍弃此次手势识别结果。
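上述第一规则的判定逻辑可以用如下示意性代码表示(阈值取文中给出的可选值90%与15%,函数名为示例):

```python
def is_valid_result(p_target, p_non_target, t1=0.90, t2=0.15):
    """第一规则:识别为目标手势的概率大于第一阈值,且识别为非目标手势的
    概率小于第二阈值,两者同时满足时分类结果才视为有效的识别结果。"""
    return p_target > t1 and p_non_target < t2
```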
可选地,上述非目标手势库也可以是提前建立好的,或者是预先配置好的。在进行手势结果有效性判断时,仅需执行:根据该非目标手势库和第一规则确定该手部的手势是否为目标手势。这一手势结果有效性判断的动作可以由一个筛选模块执行。
因此,在本申请实施例中,基于手势解构方法和为之定制的神经网络来进行手势识别,更可以被广泛应用于识别大量不同的手势。
进一步地,在本申请实施例中,基于单击手势和/或双击手势的手势识别,可以实现鼠标模拟,提高了该系统的实用性,为智能手机、平板电脑带来新的操作模式的可能。
图7是本申请一个实施例的定位追踪的方法200的示意性流程图。应理解,图7示出了该方法200的步骤或操作,但这些步骤或操作仅是示例,本申请实施例还可以执行其他操作或者图7中的各个操作的变形。该方法200可以由定位追踪的装置执行,其中,该定位追踪的装置可以是手机、平板电脑、便携式电脑、PDA等等,或者,该定位追踪的装置可以是手机中的一个模块或者系统、平板电脑中的一个模块或者系统、便携式电脑中的一个模块或者系统、PDA中的一个模块或者系统等等。
具体地,该定位追踪的方法200包括:
S210,获取毫米波信号照射目标物体且经该目标物体反射后被至少两个雷达传感器采集的一帧混频信号;
S220,根据该至少两个雷达传感器所采集的混频信号确定频谱信息;
S230,检测该频谱信息,得到多个峰值点;
S240,对该多个峰值点进行去噪声处理,以在该多个峰值点中确定第一峰值点;
S250,根据该第一峰值点处该目标物体到该至少两个雷达传感器的距离和AoA计算该目标物体在二维平面上的直角坐标系上的位置坐标。
需要说明的是,该目标物体可以为小型物体,例如手部,或者手部的某一位置或区域。由于目标物体较小,会存在相对强烈的环境干扰;本申请实施例避免或修正了这些环境干扰,使得能够获得准确无干扰的定位坐标,顺利完成对目标物体的实时定位追踪。
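步骤S250中由距离和AoA计算二维直角坐标,可以用如下示意性代码表示(其中"AoA以天线阵列法线方向为0°"的坐标约定为示例性假设,并非本申请的限定):

```python
import math

def to_cartesian(distance_m, aoa_deg):
    """由距离与AoA计算目标在二维平面直角坐标系上的位置坐标:
    x = d·sin(θ), y = d·cos(θ)(θ为以法线方向为0°的AoA,约定为示例)。"""
    theta = math.radians(aoa_deg)
    return distance_m * math.sin(theta), distance_m * math.cos(theta)
```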
可选地,如图8所示,该毫米波信号可以是由发射端设备(TX天线)发射,毫米波信号照射目标物体且经该目标物体反射后,被该至少两个雷达传感器(RX天线)采集,该至少两个雷达传感器每次采集一帧混频信号。
可选地,在本申请实施例中,在根据该第一峰值点处的该距离和该AoA计算该目标物体在二维平面上的直角坐标系上的位置坐标之前,即在上述步骤S250之前,该方法200还包括:
判断该第一峰值点处的该距离和/或该AoA是否能够正确反映该目标物体的位置坐标;
其中,若该第一峰值点处的该距离和/或该AoA能够正确反映该目标物体的位置坐标,根据该第一峰值点处的该距离和该AoA计算该目标物体在二维平面上的直角坐标系上的位置坐标;或者若该第一峰值点处的该距离和/或该AoA不能反映该目标物体的位置坐标,舍弃这一帧的混频信号。
具体地,可以通过如下方式,判断该第一峰值点处的该距离和/或该AoA是否能够正确反映该目 标物体的位置坐标:
若该第一峰值点处的该距离与第一点处的该距离之差的绝对值大于第一阈值,或者,该第一峰值点处的该AoA与第一点处的该AoA之差的绝对值大于第二阈值,则判断该第一峰值点处该距离和/或该AoA不能正确反映该目标物体的位置坐标;和/或,
若该第一峰值点处的该距离与第一点处的该距离之差的绝对值小于或者等于第一阈值,或者,该第一峰值点处的该AoA与第一点处的该AoA之差的绝对值小于或者等于第二阈值,则判断该第一峰值点处的该距离和/或该AoA能够正确反映该目标物体的位置坐标;
其中,该第一点为上一次能够正确反映该目标物体位置坐标的峰值点。
应理解,该第一点也可以称之为上一次被选出的点(lastPoint)。
可选地,该第一阈值为0.1m。可选地,该第二阈值为20度。
需要说明的是,该第一阈值和该第二阈值可以根据实际情况灵活设置。
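上述有效性判断可以用如下示意性代码表示(阈值取文中给出的可选值0.1m与20度,函数名为示例):

```python
def reflects_position(peak, last_point, d_thresh=0.1, a_thresh=20.0):
    """peak / last_point: (距离m, AoA度)。与上一次有效点(lastPoint)比较:
    距离差的绝对值大于第一阈值,或AoA差的绝对值大于第二阈值,
    则认为该第一峰值点不能正确反映目标物体的位置坐标(应舍弃该帧)。"""
    return (abs(peak[0] - last_point[0]) <= d_thresh
            and abs(peak[1] - last_point[1]) <= a_thresh)
```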
因此,通过判断第一峰值点处的距离和/或AoA是否能够正确反映目标物体的位置坐标,可以过滤一些突发异常噪声,如突然到来的高强反射物体。
可选地,在本申请实施例中,上述步骤S240具体可以是:
将该多个峰值点中距离第一点最近的峰值点确定为该第一峰值点,该第一点为上一次能够正确反映该目标物体位置坐标的峰值点。
由于该第一点为上一次能够正确反映该目标物体位置坐标的峰值点,考虑到该目标物体的移动速度和每一帧之间时间间隔,所以可以通过将该多个峰值点中距离该第一点最近的峰值点确定为该第一峰值点,来排除噪声峰值点。
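上述"选取距离第一点最近的峰值点"的去噪声处理,可以用如下示意性代码表示(其中以(距离, AoA)平面上的欧氏距离度量"最近"仅为一种示例性假设,实际也可以只比较距离维度):

```python
def pick_first_peak(peaks, last_point):
    """在多个峰值点中选取与上一次有效点(lastPoint)最近的峰值点作为第一峰值点,
    以排除噪声峰值点。peaks: [(距离m, AoA度), ...]。"""
    return min(peaks,
               key=lambda p: (p[0] - last_point[0]) ** 2
                             + (p[1] - last_point[1]) ** 2)
```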
可选地,在本申请实施例中,可以根据前K帧混频信号确定初始化的该第一点,K为正整数。即初始化该第一点。可选地,K≥5。例如,K为5、10、15或20。
由于需要与该第一点进行对比,因此需要确定初始的该第一点的位置,即需要初始化lastPoint的值。本申请实施例采用放行前K帧的做法,即在前K帧不做阈值筛选,从而找到一个可用的初始点。前K帧的数据则选择弃用;由于K帧只是很短的瞬间,因此初始化时用户并不会感到延迟。
可选地,在本申请实施例中,可以对该第一峰值点处的该AoA进行平滑处理。
具体地,对该第一峰值点处的该AoA、第一点处的该AoA和第二点处的该AoA取均值,以平滑该第一峰值点处的该AoA的抖动,其中,该第一点为上一次能够正确反映该目标物体位置坐标的峰值点,该第二点为上上一次能够正确反映该目标物体位置坐标的峰值点。
需要说明的是,由于雷达传感器芯片的局限性,得到的AoA值会有一定的抖动。本申请实施例采用将最近3次的AoA取均值的方法,很好地平滑了抖动现象,得到优良的定位轨迹。
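上述取最近3次AoA均值的平滑处理,可以用如下示意性代码表示(仅为示例):

```python
def smooth_aoa(aoa_now, aoa_last, aoa_prev):
    """对第一峰值点、第一点(上一次有效点)与第二点(上上一次有效点)处的
    AoA取均值,以平滑测角抖动。"""
    return (aoa_now + aoa_last + aoa_prev) / 3.0
```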
可选地,在本申请实施例中,可以根据该至少两个雷达传感器之间的相位差,计算该多个峰值点中每个峰值点处该目标物体到该至少两个雷达传感器的距离和AoA;或者,根据该至少两个雷达传感器之间的相位差,计算该第一峰值点处该目标物体到该至少两个雷达传感器的距离和AoA。
具体地,如上图3所示,发射波为高频连续波,其频率随时间按照三角波规律变化。雷达传感器接收的回波的频率与发射的频率变化规律相同,都是三角波规律,只是有一个时间差,利用这个微小的时间差即可计算出目标物体到发射端的距离。
该目标物体到该发射端的AoA估算使用至少两个RX天线,如图8所示。目标物体与两个RX天线的距离差会导致FFT峰值的相位变化,通过相位变化来进行AoA估算。
可选地,在本申请实施例中,上述步骤S230具体可以包括:
在该频谱信息中划定检测区域;
在该检测区域中检测,得到信号强度大于第一门限值的该多个峰值点。
在检测区域内找出信号强度大于第一门限值的该多个峰值点,峰值用于确定物体的距离和信号强度。由于毫米波对变化的分辨率极高,再加上环境物体的干扰,因此会在检测区域内形成多个检测点,表现为出现多个峰值点,峰值点处即为目标物体存在的位置或者噪声。
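上述"在检测区域内找出信号强度大于第一门限值的峰值点"的过程,可以用如下示意性代码表示(其中以"高于相邻点"作为峰值判据仅为示例性假设):

```python
def detect_peaks(spectrum, region, threshold):
    """在频谱的检测区域 region=(lo, hi) 内,找出信号强度大于门限值
    且不低于左右相邻点的峰值点索引(峰值用于确定距离与信号强度)。"""
    lo, hi = region
    return [i for i in range(max(lo, 1), min(hi, len(spectrum) - 1))
            if spectrum[i] > threshold
            and spectrum[i] >= spectrum[i - 1]
            and spectrum[i] >= spectrum[i + 1]]
```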
可选地,在本申请实施例中,上述步骤S220具体可以包括:
对该至少两个雷达传感器所采集的混频信号进行高通滤波和FFT处理,得到该频谱信息。
需要说明的是,高通滤波可以是:从该混频信号的数据中减去其平均值,得到高通滤波后的数据,从而消除所有低频分量。
因此,在本申请实施例中,能够避免或修正环境干扰,获得准确无干扰的定位坐标,顺利完成实时的定位追踪。进一步地,去除了RFID对待追踪物体的限制,扩大了使用场景。
可选地,如图9所示,本申请实施例提供了一种手势识别的设备300,该设备300包括:
获取单元310,用于获取毫米波信号照射用户手部且经该手部反射后被至少两个雷达传感器采集 的手势信息;
处理单元320,用于根据该手势信息,对该手部进行解构,得到离散的多个表面能量点;
该处理单元320,还用于根据该多个表面能量点的运动趋势,识别该手部的手势。
可选地,该多个表面能量点的运动趋势通过M帧以下帧序列中的至少一种反映,M为正整数:
向心检测点数帧序列、向心平均距离帧序列、向心平均速度帧序列、离心检测点数帧序列、离心平均距离帧序列、离心平均速度帧序列、能量质心检测点数帧序列、能量质心平均距离帧序列、能量质心平均速度帧序列、角度值α帧序列。
可选地,M=20。
可选地,该处理单元320具体用于:
将M帧反映该多个表面能量点的运动趋势的帧序列和M帧常数标定序列输入神经网络模型,识别该手部的手势。
可选地,该神经网络模型为等额神经网络模型。
可选地,该神经网络模型包括至少两个等额学习模块,该至少两个等额学习模块中每个等额学习模块从输入至输出依次包括第一卷积层、第一标准化层、线性整流激活函数层、第二卷积层、第二标准化层。
可选地,该至少两个等额学习模块中每个等额学习模块外部与内部的输入与输出尺寸设置均相等。
可选地,在该神经网络模型中,核为7×7的卷积层学习64维规格为14×7的手势信息接在该至少两个等额学习模块前,和/或,至少两层全连接层接在该至少两个等额学习模块后。
可选地,在该神经网络模型中,最大值池化层接在该至少两个等额学习模块前。
可选地,该处理单元320还用于将该多个表面能量点输入针对该手部运动的伪具象化模型,得到该多个表面能量点的运动趋势。
可选地,该处理单元320具体用于:
将该多个表面能量点按照相对于该毫米波信号的发射端的向心和离心两个运动方向进行分类,分别得到第一表面能量点集合和第二表面能量点集合;
根据该第一表面能量点集合,确定向心检测点数帧序列、向心平均距离帧序列、向心平均速度帧序列;
根据该第二表面能量点集合,确定离心检测点数帧序列、离心平均距离帧序列、离心平均速度帧序列;
根据该多个表面能量点,确定能量质心检测点数帧序列、能量质心平均距离帧序列、能量质心平均速度帧序列;
根据该多个表面能量点中每个表面能量点的到达角AoA,确定角度值α帧序列。
可选地,该处理单元320还用于
计算该多个表面能量点中每个表面能量点处该手部到该发射端的距离、该手部到该发射端的AoA、该手部相对于该发射端的速度。
可选地,该处理单元320具体用于:
根据该至少两个雷达传感器之间的相位差,计算该多个表面能量点中每个表面能量点处的该手部到该发射端的距离、该手部到该发射端的AoA、该手部相对于该发射端的速度。
可选地,该处理单元320具体用于:
对该至少两个雷达传感器所采集的该手势信息进行高通滤波和至少两次的快速傅氏变换FFT处理,得到频谱信息;
根据该频谱信息,对该手部进行解构,得到离散的该多个表面能量点。
可选地,该多个表面能量点的能量值大于第一门限值。
可选地,该处理单元320还用于:
建立非目标手势库,该非目标手势库包括大型的肢体或躯干动作、小型的指尖动作、其他运动轨迹的手部动作;
根据该非目标手势库和第一规则确定该手部的手势是否为目标手势。
可选地,该目标手势包括单击手势和/或双击手势。
可选地,该第一规则为:
步骤一,识别该手部的手势为目标手势的概率大于第一阈值;
步骤二,识别该手部的手势为非目标手势的概率小于第二阈值;
步骤三,同时满足步骤一和步骤二的手势分类结果为有效的识别结果。
可选地,该第一阈值为90%。
可选地,该第二阈值为15%。
应理解,根据本申请实施例的手势识别的设备300可对应于本申请方法实施例,并且手势识别的设备300中的各个单元的上述和其它操作和/或功能分别为了实现图1所示方法100中的相应流程,为了简洁,在此不再赘述。
可选地,如图10所示,本申请实施例提供了一种定位追踪的设备400,该设备400包括:
获取单元410,用于获取毫米波信号照射目标物体且经该目标物体反射后被至少两个雷达传感器采集的一帧混频信号;
处理单元420,用于根据该至少两个雷达传感器所采集的混频信号确定频谱信息;
该处理单元420,还用于检测该频谱信息,得到多个峰值点;
该处理单元420,还用于对该多个峰值点进行去噪声处理,以在该多个峰值点中确定第一峰值点;
该处理单元420,还用于根据该第一峰值点处该目标物体到该至少两个雷达传感器的距离和AoA计算该目标物体在二维平面上的直角坐标系上的位置坐标。
可选地,在该处理单元420根据该第一峰值点处的该距离和该AoA计算该目标物体在二维平面上的直角坐标系上的位置坐标之前,该处理单元420还用于:
判断该第一峰值点处的该距离和/或该AoA是否能够正确反映该目标物体的位置坐标;
其中,若该第一峰值点处的该距离和/或该AoA能够正确反映该目标物体的位置坐标,根据该第一峰值点处的该距离和该AoA计算该目标物体在二维平面上的直角坐标系上的位置坐标;或者若该第一峰值点处的该距离和/或该AoA不能反映该目标物体的位置坐标,舍弃这一帧的混频信号。
可选地,该处理单元420具体用于:
若该第一峰值点处的该距离与第一点处的该距离之差的绝对值大于第一阈值,或者,该第一峰值点处的该AoA与第一点处的该AoA之差的绝对值大于第二阈值,则判断该第一峰值点处该距离和/或该AoA不能正确反映该目标物体的位置坐标;和/或,
若该第一峰值点处的该距离与第一点处的该距离之差的绝对值小于或者等于第一阈值,或者,该第一峰值点处的该AoA与第一点处的该AoA之差的绝对值小于或者等于第二阈值,则判断该第一峰值点处的该距离和/或该AoA能够正确反映该目标物体的位置坐标;
其中,该第一点为上一次能够正确反映该目标物体位置坐标的峰值点。
可选地,该第一阈值为0.1m。
可选地,该第二阈值为20度。
可选地,该处理单元420具体用于:
将该多个峰值点中距离第一点最近的峰值点确定为该第一峰值点,该第一点为上一次能够正确反映该目标物体位置坐标的峰值点。
可选地,该处理单元420还用于根据前K帧混频信号确定初始化的该第一点,K为正整数。
可选地,K≥5。
可选地,该处理单元420还用于对该第一峰值点处的该AoA进行平滑处理。
可选地,该处理单元420还用于:
对该第一峰值点处的该AoA、第一点处的该AoA和第二点处的该AoA取均值,以平滑该第一峰值点处的该AoA的抖动,其中,该第一点为上一次能够正确反映该目标物体位置坐标的峰值点,该第二点为上上一次能够正确反映该目标物体位置坐标的峰值点。
可选地,该处理单元420还用于:
根据该至少两个雷达传感器之间的相位差,计算该多个峰值点中每个峰值点处该目标物体到该至少两个雷达传感器的距离和AoA;或者
根据该至少两个雷达传感器之间的相位差,计算该第一峰值点处该目标物体到该至少两个雷达传感器的距离和AoA。
可选地,该处理单元420具体用于:
在该频谱信息中划定检测区域;
在该检测区域中检测,得到信号强度大于第一门限值的该多个峰值点。
可选地,该处理单元420具体用于:
对该至少两个雷达传感器所采集的混频信号进行高通滤波和FFT处理,得到该频谱信息。
应理解,根据本申请实施例的定位追踪的设备400可对应于本申请方法实施例,并且定位追踪的设备400中的各个单元的上述和其它操作和/或功能分别为了实现图7所示方法200中的相应流程,为了简洁,在此不再赘述。
可选地,如图11所示,本申请实施例提供了一种手势识别的装置500,该手势识别的装置500包括:
存储器510,用于存储程序和数据;以及
处理器520,用于调用并运行所述存储器中存储的程序和数据;
该装置500被配置为执行上述图1至6中所示的方法。
可选地,如图12所示,本申请实施例提供了一种定位追踪的装置600,该定位追踪的装置600包括:
存储器610,用于存储程序和数据;以及
处理器620,用于调用并运行所述存储器中存储的程序和数据;
该装置600被配置为执行上述图7至8中所示的方法。
可选地,如图13所示,本申请实施例提供了一种手势识别的系统700,包括:
发射端设备710,用于发射毫米波信号;
至少两个雷达传感器720,用于采集毫米波信号照射用户手部且经所述手部反射后的手势信息;
装置730,包括用于存储程序和数据的存储器731和用于调用并运行所述存储器中存储的程序和数据的处理器732,所述装置730被配置为执行上述图1至6中所示的方法。
可选地,如图14所示,本申请实施例提供了一种定位追踪的系统800,包括:
发射端设备810,用于发射毫米波信号;
至少两个雷达传感器820,用于采集毫米波信号照射目标物体且经所述目标物体反射后的一帧混频信号;
装置830,包括用于存储程序和数据的存储器831和用于调用并运行所述存储器中存储的程序和数据的处理器832,所述装置830被配置为执行上述图7至8中所示的方法。
应理解,本申请实施例的处理器可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器可以是通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
可以理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
应理解,上述存储器为示例性但不是限制性说明,例如,本申请实施例中的存储器还可以是静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synch link DRAM,SLDRAM)以及直接内存总线随机存取存储器(Direct Rambus RAM,DR RAM)等等。也就是说,本申请实施例中的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
本申请实施例还提供了一种计算机可读存储介质,用于存储计算机程序。
可选的,该计算机可读存储介质可应用于本申请实施例中的手势识别的装置,并且该计算机程序使得计算机执行本申请实施例的各个方法中由手势识别的装置实现的相应流程,为了简洁,在此不再 赘述。
可选地,该计算机可读存储介质可应用于本申请实施例中的定位追踪的装置,并且该计算机程序使得计算机执行本申请实施例的各个方法中由定位追踪的装置实现的相应流程,为了简洁,在此不再赘述。
本申请实施例还提供了一种计算机程序产品,包括计算机程序指令。
可选的,该计算机程序产品可应用于本申请实施例中的手势识别的装置,并且该计算机程序指令使得计算机执行本申请实施例的各个方法中由手势识别的装置实现的相应流程,为了简洁,在此不再赘述。
可选地,该计算机程序产品可应用于本申请实施例中的定位追踪的装置,并且该计算机程序指令使得计算机执行本申请实施例的各个方法中由定位追踪的装置实现的相应流程,为了简洁,在此不再赘述。
本申请实施例还提供了一种计算机程序。
可选的,该计算机程序可应用于本申请实施例中的手势识别的装置,当该计算机程序在计算机上运行时,使得计算机执行本申请实施例的各个方法中由手势识别的装置实现的相应流程,为了简洁,在此不再赘述。
可选地,该计算机程序可应用于本申请实施例中的定位追踪的装置,当该计算机程序在计算机上运行时,使得计算机执行本申请实施例的各个方法中由定位追踪的装置实现的相应流程,为了简洁,在此不再赘述。
应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
应理解,本文中术语“系统”和“网络”在本文中常被可互换使用。本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,该单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以该权利要求的保护范围为准。

Claims (76)

  1. 一种手势识别的方法,其特征在于,包括:
    获取毫米波信号照射用户手部且经所述手部反射后被至少两个雷达传感器采集的手势信息;
    根据所述手势信息,对所述手部进行解构,得到离散的多个表面能量点;
    根据所述多个表面能量点的运动趋势,识别所述手部的手势。
  2. 根据权利要求1所述的方法,其特征在于,所述多个表面能量点的运动趋势通过M帧以下帧序列中的至少一种反映,M为正整数:
    向心检测点数帧序列、向心平均距离帧序列、向心平均速度帧序列、离心检测点数帧序列、离心平均距离帧序列、离心平均速度帧序列、能量质心检测点数帧序列、能量质心平均距离帧序列、能量质心平均速度帧序列、角度值α帧序列。
  3. 根据权利要求2所述的方法,其特征在于,M=20。
  4. 根据权利要求2或3所述的方法,其特征在于,所述根据所述多个表面能量点的运动趋势,识别所述手部的手势,包括:
    将M帧反映所述多个表面能量点的运动趋势的帧序列和M帧常数标定序列输入神经网络模型,识别所述手部的手势。
  5. 根据权利要求4所述的方法,其特征在于,所述神经网络模型为等额神经网络模型。
  6. 根据权利要求5所述的方法,其特征在于,所述神经网络模型包括至少两个等额学习模块,所述至少两个等额学习模块中每个等额学习模块从输入至输出依次包括第一卷积层、第一标准化层、线性整流激活函数层、第二卷积层、第二标准化层。
  7. 根据权利要求6所述的方法,其特征在于,所述至少两个等额学习模块中每个等额学习模块外部与内部的输入与输出尺寸设置均相等。
  8. 根据权利要求6或7所述的方法,其特征在于,在所述神经网络模型中,核为7×7的卷积层学习64维规格为14×7的手势信息接在所述至少两个等额学习模块前,和/或,至少两层全连接层接在所述至少两个等额学习模块后。
  9. 根据权利要求6至8中任一项所述的方法,其特征在于,在所述神经网络模型中,最大值池化层接在所述至少两个等额学习模块前。
  10. 根据权利要求1至9中任一项所述的方法,其特征在于,所述方法还包括:
    将所述多个表面能量点输入针对所述手部运动的伪具象化模型,得到所述多个表面能量点的运动趋势。
  11. 根据权利要求10所述的方法,其特征在于,所述将所述多个表面能量点输入针对所述手部运动的伪具象化模型,得到所述多个表面能量点的运动趋势,包括:
    将所述多个表面能量点按照相对于所述毫米波信号的发射端的向心和离心两个运动方向进行分类,分别得到第一表面能量点集合和第二表面能量点集合;
    根据所述第一表面能量点集合,确定向心检测点数帧序列、向心平均距离帧序列、向心平均速度帧序列;
    根据所述第二表面能量点集合,确定离心检测点数帧序列、离心平均距离帧序列、离心平均速度帧序列;
    根据所述多个表面能量点,确定能量质心检测点数帧序列、能量质心平均距离帧序列、能量质心平均速度帧序列;
    根据所述多个表面能量点中每个表面能量点的到达角AoA,确定角度值α帧序列。
  12. 根据权利要求11所述的方法,其特征在于,所述方法还包括:
    计算所述多个表面能量点中每个表面能量点处所述手部到所述发射端的距离、所述手部到所述发射端的AoA、所述手部相对于所述发射端的速度。
  13. 根据权利要求12所述的方法,其特征在于,所述计算所述多个表面能量点中每个表面能量点处所述手部到所述发射端的距离、所述手部到所述发射端的AoA、所述手部相对于所述发射端的速度,包括:
    根据所述至少两个雷达传感器之间的相位差,计算所述多个表面能量点中每个表面能量点处的所述手部到所述发射端的距离、所述手部到所述发射端的AoA、所述手部相对于所述发射端的速度。
  14. 根据权利要求1至13中任一项所述的方法,其特征在于,所述根据所述手势信息,对所述手部进行解构,得到离散的多个表面能量点,包括:
    对所述至少两个雷达传感器所采集的所述手势信息进行高通滤波和至少两次的快速傅氏变换FFT处理,得到频谱信息;
    根据所述频谱信息,对所述手部进行解构,得到离散的所述多个表面能量点。
  15. 根据权利要求14所述的方法,其特征在于,所述多个表面能量点的能量值大于第一门限值。
  16. 根据权利要求1至15中任一项所述的方法,其特征在于,所述方法还包括:
    建立非目标手势库,所述非目标手势库包括大型的肢体或躯干动作、小型的指尖动作、其他运动轨迹的手部动作;
    根据所述非目标手势库和第一规则确定所述手部的手势是否为目标手势。
  17. 根据权利要求16所述的方法,其特征在于,所述目标手势包括单击手势和/或双击手势。
  18. 根据权利要求17所述的方法,其特征在于,所述第一规则为:
    步骤一,识别所述手部的手势为目标手势的概率大于第一阈值;
    步骤二,识别所述手部的手势为非目标手势的概率小于第二阈值;
    步骤三,同时满足步骤一和步骤二的手势分类结果为有效的识别结果。
  19. 根据权利要求18所述的方法,其特征在于,所述第一阈值为90%。
  20. 根据权利要求18或19所述的方法,其特征在于,所述第二阈值为15%。
  21. 一种定位追踪的方法,其特征在于,包括:
    获取毫米波信号照射目标物体且经所述目标物体反射后被至少两个雷达传感器采集的一帧混频信号;
    根据所述至少两个雷达传感器所采集的混频信号确定频谱信息;
    检测所述频谱信息,得到多个峰值点;
    对所述多个峰值点进行去噪声处理,以在所述多个峰值点中确定第一峰值点;
    根据所述第一峰值点处所述目标物体到所述至少两个雷达传感器的距离和到达角AoA计算所述目标物体在二维平面上的直角坐标系上的位置坐标。
  22. 根据权利要求21所述的方法,其特征在于,在根据所述第一峰值点处的所述距离和所述AoA计算所述目标物体在二维平面上的直角坐标系上的位置坐标之前,所述方法还包括:
    判断所述第一峰值点处的所述距离和/或所述AoA是否能够正确反映所述目标物体的位置坐标;
    其中,若所述第一峰值点处的所述距离和/或所述AoA能够正确反映所述目标物体的位置坐标,根据所述第一峰值点处的所述距离和所述AoA计算所述目标物体在二维平面上的直角坐标系上的位置坐标;或者若所述第一峰值点处的所述距离和/或所述AoA不能反映所述目标物体的位置坐标,舍弃这一帧的混频信号。
  23. 根据权利要求22所述的方法,其特征在于,所述判断所述第一峰值点处的所述距离和/或所述AoA是否能够正确反映所述目标物体的位置坐标,包括:
    若所述第一峰值点处的所述距离与第一点处的所述距离之差的绝对值大于第一阈值,或者,所述第一峰值点处的所述AoA与第一点处的所述AoA之差的绝对值大于第二阈值,则判断所述第一峰值点处所述距离和/或所述AoA不能正确反映所述目标物体的位置坐标;和/或,
    若所述第一峰值点处的所述距离与第一点处的所述距离之差的绝对值小于或者等于第一阈值,或者,所述第一峰值点处的所述AoA与第一点处的所述AoA之差的绝对值小于或者等于第二阈值,则判断所述第一峰值点处的所述距离和/或所述AoA能够正确反映所述目标物体的位置坐标;
    其中,所述第一点为上一次能够正确反映所述目标物体位置坐标的峰值点。
  24. 根据权利要求23所述的方法,其特征在于,所述第一阈值为0.1m。
  25. 根据权利要求23或24所述的方法,其特征在于,所述第二阈值为20度。
  26. 根据权利要求21至25中任一项所述的方法,其特征在于,所述对所述多个峰值点进行去噪声处理,以在所述多个峰值点中确定第一峰值点,包括:
    将所述多个峰值点中距离第一点最近的峰值点确定为所述第一峰值点,所述第一点为上一次能够正确反映所述目标物体位置坐标的峰值点。
  27. 根据权利要求22至26中任一项所述的方法,其特征在于,所述方法还包括:
    根据前K帧混频信号确定初始化的所述第一点,K为正整数。
  28. 根据权利要求27所述的方法,其特征在于,K≥5。
  29. 根据权利要求21至28中任一项所述的方法,其特征在于,所述方法还包括:
    对所述第一峰值点处的所述AoA进行平滑处理。
  30. 根据权利要求29所述的方法,其特征在于,所述对所述第一峰值点处的所述AoA进行平滑处理,包括:
    对所述第一峰值点处的所述AoA、第一点处的所述AoA和第二点处的所述AoA取均值,以平滑所述第一峰值点处的所述AoA的抖动,其中,所述第一点为上一次能够正确反映所述目标物体位置 坐标的峰值点,所述第二点为上上一次能够正确反映所述目标物体位置坐标的峰值点。
  31. 根据权利要求21至30中任一项所述的方法,其特征在于,所述方法还包括:
    根据所述至少两个雷达传感器之间的相位差,计算所述多个峰值点中每个峰值点处所述目标物体到所述至少两个雷达传感器的距离和AoA;或者
    根据所述至少两个雷达传感器之间的相位差,计算所述第一峰值点处所述目标物体到所述至少两个雷达传感器的距离和AoA。
  32. 根据权利要求21至31中任一项所述的方法,其特征在于,所述检测所述频谱信息,得到多个峰值点,包括:
    在所述频谱信息中划定检测区域;
    在所述检测区域中检测,得到信号强度大于第一门限值的所述多个峰值点。
  33. 根据权利要求21至32中任一项所述的方法,其特征在于,所述根据所述至少两个雷达传感器所采集的混频信号确定频谱信息,包括:
    对所述至少两个雷达传感器所采集的混频信号进行高通滤波和快速傅氏变换FFT处理,得到所述频谱信息。
  34. 一种手势识别的设备,其特征在于,包括:
    获取单元,用于获取毫米波信号照射用户手部且经所述手部反射后被至少两个雷达传感器采集的手势信息;
    处理单元,用于根据所述手势信息,对所述手部进行解构,得到离散的多个表面能量点;
    所述处理单元,还用于根据所述多个表面能量点的运动趋势,识别所述手部的手势。
  35. 根据权利要求34所述的设备,其特征在于,所述多个表面能量点的运动趋势通过M帧以下帧序列中的至少一种反映,M为正整数:
    向心检测点数帧序列、向心平均距离帧序列、向心平均速度帧序列、离心检测点数帧序列、离心平均距离帧序列、离心平均速度帧序列、能量质心检测点数帧序列、能量质心平均距离帧序列、能量质心平均速度帧序列、角度值α帧序列。
  36. 根据权利要求35所述的设备,其特征在于,M=20。
  37. 根据权利要求35或36所述的设备,其特征在于,所述处理单元具体用于:
    将M帧反映所述多个表面能量点的运动趋势的帧序列和M帧常数标定序列输入神经网络模型,识别所述手部的手势。
  38. 根据权利要求37所述的设备,其特征在于,所述神经网络模型为等额神经网络模型。
  39. 根据权利要求38所述的设备,其特征在于,所述神经网络模型包括至少两个等额学习模块,所述至少两个等额学习模块中每个等额学习模块从输入至输出依次包括第一卷积层、第一标准化层、线性整流激活函数层、第二卷积层、第二标准化层。
  40. 根据权利要求39所述的设备,其特征在于,所述至少两个等额学习模块中每个等额学习模块外部与内部的输入与输出尺寸设置均相等。
  41. 根据权利要求39或40所述的设备,其特征在于,在所述神经网络模型中,核为7×7的卷积层学习64维规格为14×7的手势信息接在所述至少两个等额学习模块前,和/或,至少两层全连接层接在所述至少两个等额学习模块后。
  42. 根据权利要求39至41中任一项所述的设备,其特征在于,在所述神经网络模型中,最大值池化层接在所述至少两个等额学习模块前。
  43. 根据权利要求34至42中任一项所述的设备,其特征在于,所述处理单元还用于将所述多个表面能量点输入针对所述手部运动的伪具象化模型,得到所述多个表面能量点的运动趋势。
  44. 根据权利要求43所述的设备,其特征在于,所述处理单元具体用于:
    将所述多个表面能量点按照相对于所述毫米波信号的发射端的向心和离心两个运动方向进行分类,分别得到第一表面能量点集合和第二表面能量点集合;
    根据所述第一表面能量点集合,确定向心检测点数帧序列、向心平均距离帧序列、向心平均速度帧序列;
    根据所述第二表面能量点集合,确定离心检测点数帧序列、离心平均距离帧序列、离心平均速度帧序列;
    根据所述多个表面能量点,确定能量质心检测点数帧序列、能量质心平均距离帧序列、能量质心平均速度帧序列;
    根据所述多个表面能量点中每个表面能量点的到达角AoA,确定角度值α帧序列。
  45. 根据权利要求44所述的设备,其特征在于,所述处理单元还用于
    计算所述多个表面能量点中每个表面能量点处所述手部到所述发射端的距离、所述手部到所述发射端的AoA、所述手部相对于所述发射端的速度。
  46. 根据权利要求45所述的设备,其特征在于,所述处理单元具体用于:
    根据所述至少两个雷达传感器之间的相位差,计算所述多个表面能量点中每个表面能量点处的所述手部到所述发射端的距离、所述手部到所述发射端的AoA、所述手部相对于所述发射端的速度。
  47. 根据权利要求34至46中任一项所述的设备,其特征在于,所述处理单元具体用于:
    对所述至少两个雷达传感器所采集的所述手势信息进行高通滤波和至少两次的快速傅氏变换FFT处理,得到频谱信息;
    根据所述频谱信息,对所述手部进行解构,得到离散的所述多个表面能量点。
  48. 根据权利要求47所述的设备,其特征在于,所述多个表面能量点的能量值大于第一门限值。
  49. 根据权利要求34至48中任一项所述的设备,其特征在于,所述处理单元还用于:
    建立非目标手势库,所述非目标手势库包括大型的肢体或躯干动作、小型的指尖动作、其他运动轨迹的手部动作;
    根据所述非目标手势库和第一规则确定所述手部的手势是否为目标手势。
  50. 根据权利要求49所述的设备,其特征在于,所述目标手势包括单击手势和/或双击手势。
  51. 根据权利要求50所述的设备,其特征在于,所述第一规则为:
    步骤一,识别所述手部的手势为目标手势的概率大于第一阈值;
    步骤二,识别所述手部的手势为非目标手势的概率小于第二阈值;
    步骤三,同时满足步骤一和步骤二的手势分类结果为有效的识别结果。
  52. 根据权利要求51所述的设备,其特征在于,所述第一阈值为90%。
  53. 根据权利要求51或52所述的设备,其特征在于,所述第二阈值为15%。
  54. 一种定位追踪的设备,其特征在于,包括:
    获取单元,用于获取毫米波信号照射目标物体且经所述目标物体反射后被至少两个雷达传感器采集的一帧混频信号;
    处理单元,用于根据所述至少两个雷达传感器所采集的混频信号确定频谱信息;
    所述处理单元,还用于检测所述频谱信息,得到多个峰值点;
    所述处理单元,还用于对所述多个峰值点进行去噪声处理,以在所述多个峰值点中确定第一峰值点;
    所述处理单元,还用于根据所述第一峰值点处所述目标物体到所述至少两个雷达传感器的距离和到达角AoA计算所述目标物体在二维平面上的直角坐标系上的位置坐标。
  55. 根据权利要求54所述的设备,其特征在于,在所述处理单元根据所述第一峰值点处的所述距离和所述AoA计算所述目标物体在二维平面上的直角坐标系上的位置坐标之前,所述处理单元还用于:
    判断所述第一峰值点处的所述距离和/或所述AoA是否能够正确反映所述目标物体的位置坐标;
    其中,若所述第一峰值点处的所述距离和/或所述AoA能够正确反映所述目标物体的位置坐标,根据所述第一峰值点处的所述距离和所述AoA计算所述目标物体在二维平面上的直角坐标系上的位置坐标;或者若所述第一峰值点处的所述距离和/或所述AoA不能反映所述目标物体的位置坐标,舍弃这一帧的混频信号。
  56. 根据权利要求55所述的设备,其特征在于,所述处理单元具体用于:
    若所述第一峰值点处的所述距离与第一点处的所述距离之差的绝对值大于第一阈值,或者,所述第一峰值点处的所述AoA与第一点处的所述AoA之差的绝对值大于第二阈值,则判断所述第一峰值点处所述距离和/或所述AoA不能正确反映所述目标物体的位置坐标;和/或,
    若所述第一峰值点处的所述距离与第一点处的所述距离之差的绝对值小于或者等于第一阈值,或者,所述第一峰值点处的所述AoA与第一点处的所述AoA之差的绝对值小于或者等于第二阈值,则判断所述第一峰值点处的所述距离和/或所述AoA能够正确反映所述目标物体的位置坐标;
    其中,所述第一点为上一次能够正确反映所述目标物体位置坐标的峰值点。
  57. 根据权利要求56所述的设备,其特征在于,所述第一阈值为0.1m。
  58. 根据权利要求56或57所述的设备,其特征在于,所述第二阈值为20度。
  59. 根据权利要求54至58中任一项所述的设备,其特征在于,所述处理单元具体用于:
    将所述多个峰值点中距离第一点最近的峰值点确定为所述第一峰值点,所述第一点为上一次能够正确反映所述目标物体位置坐标的峰值点。
  60. 根据权利要求55至59中任一项所述的设备,其特征在于,所述处理单元还用于根据前K帧 混频信号确定初始化的所述第一点,K为正整数。
  61. 根据权利要求60所述的设备,其特征在于,K≥5。
  62. 根据权利要求54至61中任一项所述的设备,其特征在于,所述处理单元还用于对所述第一峰值点处的所述AoA进行平滑处理。
  63. 根据权利要求62所述的设备,其特征在于,所述处理单元还用于:
    对所述第一峰值点处的所述AoA、第一点处的所述AoA和第二点处的所述AoA取均值,以平滑所述第一峰值点处的所述AoA的抖动,其中,所述第一点为上一次能够正确反映所述目标物体位置坐标的峰值点,所述第二点为上上一次能够正确反映所述目标物体位置坐标的峰值点。
  64. 根据权利要求54至63中任一项所述的设备,其特征在于,所述处理单元还用于:
    根据所述至少两个雷达传感器之间的相位差,计算所述多个峰值点中每个峰值点处所述目标物体到所述至少两个雷达传感器的距离和AoA;或者
    根据所述至少两个雷达传感器之间的相位差,计算所述第一峰值点处所述目标物体到所述至少两个雷达传感器的距离和AoA。
  65. 根据权利要求54至64中任一项所述的设备,其特征在于,所述处理单元具体用于:
    在所述频谱信息中划定检测区域;
    在所述检测区域中检测,得到信号强度大于第一门限值的所述多个峰值点。
  66. 根据权利要求54至65中任一项所述的设备,其特征在于,所述处理单元具体用于:
    对所述至少两个雷达传感器所采集的混频信号进行高通滤波和快速傅氏变换FFT处理,得到所述频谱信息。
  67. 一种手势识别的装置,其特征在于,包括:
    存储器,用于存储程序和数据;以及
    处理器,用于调用并运行所述存储器中存储的程序和数据;
    所述装置被配置为:执行如权利要求1至20中任一项所述的方法。
  68. 一种定位追踪的装置,其特征在于,包括:
    存储器,用于存储程序和数据;以及
    处理器,用于调用并运行所述存储器中存储的程序和数据;
    所述装置被配置为:执行如权利要求21至33中任一项所述的方法。
  69. 一种手势识别的系统,其特征在于,包括:
    发射端设备,用于发射毫米波信号;
    至少两个雷达传感器,用于采集毫米波信号照射用户手部且经所述手部反射后的手势信息;
    装置,包括用于存储程序和数据的存储器和用于调用并运行所述存储器中存储的程序和数据的处理器,所述装置被配置为执行如权利要求1至20中任一项所述的方法。
  70. 一种定位追踪的系统,其特征在于,包括:
    发射端设备,用于发射毫米波信号;
    至少两个雷达传感器,用于采集毫米波信号照射目标物体且经所述目标物体反射后的一帧混频信号;
    装置,包括用于存储程序和数据的存储器和用于调用并运行所述存储器中存储的程序和数据的处理器,所述装置被配置为执行如权利要求21至33中任一项所述的方法。
  71. 一种计算机可读存储介质,其特征在于,用于存储计算机程序,所述计算机程序使得计算机执行如权利要求1至20中任一项所述的方法。
  72. 一种计算机可读存储介质,其特征在于,用于存储计算机程序,所述计算机程序使得计算机执行如权利要求21至33中任一项所述的方法。
  73. 一种计算机程序产品,其特征在于,包括计算机程序指令,该计算机程序指令使得计算机执行如权利要求1至20中任一项所述的方法。
  74. 一种计算机程序产品,其特征在于,包括计算机程序指令,该计算机程序指令使得计算机执行如权利要求21至33中任一项所述的方法。
  75. 一种计算机程序,其特征在于,所述计算机程序使得计算机执行如权利要求1至20中任一项所述的方法。
  76. 一种计算机程序,其特征在于,所述计算机程序使得计算机执行如权利要求21至33中任一项所述的方法。