WO2023020268A1 - Gesture recognition method and apparatus, device, and medium - Google Patents

Gesture recognition method and apparatus, device, and medium

Info

Publication number
WO2023020268A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
target hand
gesture recognition
hand
preset
Prior art date
Application number
PCT/CN2022/109467
Other languages
English (en)
Chinese (zh)
Inventor
李海洋
安龙飞
赵晓旭
颜世秦
侯俊杰
聂超
熊巧奇
张新田
王伟
杨文瀚
李进进
王照顺
刘高强
王鹏飞
慕岳衷
Original Assignee
北京有竹居网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京有竹居网络技术有限公司
Publication of WO2023020268A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person

Definitions

  • The present disclosure relates to the technical field of image recognition, and in particular to a gesture recognition method, apparatus, device, and medium.
  • As an important part of human-computer interaction, gesture recognition has attracted widespread attention in an increasing number of fields.
  • In the related art, gesture recognition typically uses sensors to extract hand features and obtain the position information corresponding to a gesture.
  • However, interference caused by different motion states makes it difficult to accurately locate and recognize user gestures.
  • To address this, the present disclosure provides a gesture recognition method, apparatus, device, and medium.
  • An embodiment of the present disclosure provides a gesture recognition method, the method comprising:
  • acquiring a target image; determining horizontal motion stability data of a target hand by performing motion recognition on the target image; determining a vertical distance between the target hand and a preset plane; and, when it is determined based on the horizontal motion stability data that the target hand is stable in the horizontal direction and it is determined based on the vertical distance between the target hand and the preset plane that the target hand is stable in the vertical direction, performing gesture recognition on the target hand in the target image.
  • An embodiment of the present disclosure also provides a gesture recognition device, the device comprising:
  • An image acquisition module configured to acquire a target image
  • A horizontal data module configured to determine the horizontal motion stability data of the target hand by performing motion recognition on the target image
  • a vertical data module configured to determine the vertical distance between the target hand and a preset plane
  • a gesture recognition module configured to perform gesture recognition on the target hand in the target image when it is determined based on the horizontal motion stability data that the target hand is stable in the horizontal direction, and it is determined based on the vertical distance between the target hand and a preset plane that the target hand is stable in the vertical direction.
  • An embodiment of the present disclosure also provides an electronic device, which includes: a processor; and a memory for storing instructions executable by the processor; the processor is configured to read the executable instructions from the memory and execute them to implement the gesture recognition method provided by the embodiments of the present disclosure.
  • An embodiment of the present disclosure also provides a computer-readable storage medium storing a computer program, where the computer program is used to execute the gesture recognition method provided by the embodiments of the present disclosure.
  • The gesture recognition solution provided by the embodiments of the present disclosure acquires a target image, determines horizontal motion stability data of a target hand by performing motion recognition on the target image, and determines the vertical distance between the target hand and a preset plane; when the target hand is determined to be stable in the horizontal direction based on the horizontal motion stability data, and determined to be stable in the vertical direction based on the vertical distance between the target hand and the preset plane, gesture recognition is performed on the target hand in the target image.
  • Because gesture recognition is performed only after the hand is stable, the large errors caused in the related art by hand movement in the horizontal and/or vertical directions are avoided, thereby improving the accuracy of gesture recognition.
  • FIG. 1 is a schematic flowchart of a gesture recognition method provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of another gesture recognition method provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of gesture recognition provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a gesture recognition device provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • The term “comprise” and its variations are open-ended, i.e., “including but not limited to”.
  • The term “based on” means “based at least in part on”.
  • The term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one further embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a schematic flow chart of a gesture recognition method provided by an embodiment of the present disclosure.
  • The method can be executed by a gesture recognition device, where the device can be implemented by software and/or hardware, and can generally be integrated into an electronic device.
  • The gesture recognition method in the embodiments of the present disclosure can be applied to any electronic device that needs gesture recognition; for example, the electronic device can be a mobile phone, a tablet computer, a desktop computer, a notebook computer, a vehicle-mounted terminal, a wearable device, an all-in-one machine, a smart home device, or another device with communication functions.
  • the method includes:
  • Step 101: acquire a target image.
  • The target image may be an image including the current user's hand collected by a preset image collector, or an image frame including the current user's hand extracted from a video.
  • The target image may include RGB image frames and depth images extracted from a video.
  • Embodiments of the present disclosure do not limit the specific image collector; different types of image collectors are used to collect the corresponding images, for example, a depth image collector is used to collect the above-mentioned depth images.
  • Step 102: determine the horizontal motion stability data of the target hand by performing motion recognition on the target image.
  • The motion recognition may be recognition of the motion state of the hand in the target image, specifically recognition of the stability of the horizontal motion state, and the horizontal motion stability data may be the result of this recognition.
  • In some embodiments, the target image includes a current RGB image frame and a previous RGB image frame extracted from the video, and determining the horizontal motion stability data of the target hand includes: based on the current RGB image frame and the previous RGB image frame, using an optical flow algorithm to calculate the optical flow field of the current RGB image frame, and thresholding the optical flow field to obtain a foreground area including the target hand and a background area; and determining the velocity vector of the foreground area and the velocity vector of the background area as the horizontal motion stability data of the target hand.
  • Optical flow is the instantaneous velocity of the pixel movement of a spatially moving object on the observation imaging plane.
  • When the time interval is small, it is approximately equivalent to the displacement of the target point.
  • the instantaneous rate of change of gray level at a specific coordinate point on a two-dimensional image plane is defined as the optical flow vector.
  • the optical flow method uses the changes of pixels in the image sequence in the time domain and the correlation between adjacent frames to find the corresponding relationship between the previous frame and the current frame, thereby calculating the motion of objects between adjacent frames.
  • Optical flow contains information about object motion.
  • the optical flow field is a two-dimensional vector field, which reflects the change trend of each gray point on the image, and can be regarded as the instantaneous velocity field generated by the movement of gray pixels on the image plane.
  • the information it contains is the instantaneous motion velocity vector information of each pixel.
  • an optical flow field corresponds to a motion field.
  • In the embodiment of the present disclosure, the gesture recognition device can use the optical flow algorithm for motion recognition. Specifically, after resampling and denoising preprocessing of the current RGB image frame and the previous RGB image frame, the optical flow method can be used to calculate the optical flow value of each point in the current RGB image frame; the optical flow field is then thresholded to distinguish the foreground area, which includes the target hand, from the background area. The velocity vector of the foreground area and the velocity vector of the background area can then be determined as the horizontal motion stability data of the target hand.
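The thresholding step described above can be sketched in a few lines of NumPy. This is a minimal illustration only, assuming the dense optical flow field (one 2-D velocity vector per pixel) has already been computed, for example with OpenCV's calcOpticalFlowFarneback; the function names and threshold values here are illustrative, not taken from the patent.

```python
import numpy as np

def split_flow_field(flow, mag_threshold=1.0):
    """Threshold a dense optical-flow field (H x W x 2) into a foreground
    region (fast-moving pixels, assumed to contain the hand) and the
    background, returning the mean velocity vector of each region."""
    magnitude = np.linalg.norm(flow, axis=2)
    foreground = magnitude >= mag_threshold
    background = ~foreground
    fg_vec = flow[foreground].mean(axis=0) if foreground.any() else np.zeros(2)
    bg_vec = flow[background].mean(axis=0) if background.any() else np.zeros(2)
    return fg_vec, bg_vec

def horizontally_stable(fg_vec, bg_vec, preset_threshold=0.5):
    """Stable in the horizontal direction when the foreground and background
    velocity vectors differ by less than the preset threshold."""
    return np.linalg.norm(fg_vec - bg_vec) < preset_threshold

# Synthetic flow field: background at rest, a 20x20 patch moving right
# at 3 px/frame stands in for the hand.
flow = np.zeros((100, 100, 2))
flow[40:60, 40:60, 0] = 3.0
fg, bg = split_flow_field(flow)
print(horizontally_stable(fg, bg))  # False: the hand is still moving
```

When the hand stops, the foreground and background mean vectors converge and the check flips to stable.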
  • In other embodiments, the gesture recognition device can also use the Continuously Adaptive MeanShift (CamShift) algorithm or an active contour tracking algorithm to perform motion recognition on the target image; the specific process will not be repeated here.
  • Step 103: determine the vertical distance between the target hand and the preset plane.
  • the preset plane may be a plane where the electronic device currently performing gesture recognition is located, and the preset plane may be a horizontal plane or a vertical plane, which is specifically determined according to an actual scene.
  • For example, when the electronic device is placed horizontally, the preset plane is the horizontal plane on which the electronic device is located; or, when the electronic device is mounted vertically and gesture recognition needs to be performed on a user in front of it, the preset plane is the vertical plane on which the electronic device is located, such as a vertical wall.
  • the vertical distance between the target hand and the preset plane can be determined in various ways, for example, it can be determined by extracting a depth image or based on a distance sensor, which is only an example and not a limitation.
  • In some embodiments, the target image includes a first depth image at a first moment and a second depth image at a second moment, and determining the vertical distance between the target hand and the preset plane includes: based on the first depth image and the second depth image, respectively extracting a first vertical distance and a second vertical distance between the target hand and the preset plane at the first moment and the second moment, where both the first depth image and the second depth image include the target hand and the preset plane.
  • The second moment is after the first moment, and there is a preset time interval between the first moment and the second moment; for example, the preset time interval may be 30 seconds.
  • A depth image, also called a range image, is an image in which the pixel value of each point is the distance (depth) from the image collector to that point in the scene; it directly reflects the geometry of the visible surface of an object.
  • Specifically, the gesture recognition device can acquire, through a depth sensor, a first depth image and a second depth image that include the target hand and the preset plane, where the first depth image corresponds to the first moment and the second depth image corresponds to the second moment. The first vertical distance and the second vertical distance between a preset point of the target hand and the preset plane at the first moment and the second moment can then be extracted from the first depth image and the second depth image, respectively.
  • The preset point can be set as a position point on the target hand; for example, it can be the fingertip of any finger or the palm center of the target hand.
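As a concrete illustration of the depth-image variant, the sketch below treats each depth image as a NumPy array of per-pixel distances and reads off the vertical distance at a known preset point (e.g. a fingertip pixel located by some keypoint detector). All names and numeric values are hypothetical; the patent does not prescribe this representation.

```python
import numpy as np

def vertical_distance(depth_image, point, plane_depth):
    """Vertical distance from a preset point of the hand (row, col) to the
    preset plane, given a per-pixel depth map and the plane's own depth
    (which could itself be read from a region of the image where the
    plane is visible)."""
    r, c = point
    return float(plane_depth - depth_image[r, c])

# Two depth images (values in cm) with the fingertip at pixel (50, 50):
# the plane sits 80 cm from the sensor; the fingertip hovers above it.
plane_depth = 80.0
first = np.full((100, 100), plane_depth)
first[50, 50] = 68.0    # first moment
second = np.full((100, 100), plane_depth)
second[50, 50] = 68.4   # second moment

d1 = vertical_distance(first, (50, 50), plane_depth)   # ~12.0 cm
d2 = vertical_distance(second, (50, 50), plane_depth)  # ~11.6 cm
```

The two distances sampled at the first and second moments are exactly the inputs to the vertical-stability check described below.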
  • In other embodiments, determining the vertical distance between the target hand and the preset plane may include: using a distance sensor to respectively determine the first vertical distance and the second vertical distance between the target hand and the preset plane at the first moment and the second moment.
  • The distance sensor may be a sensor for sensing the distance between itself and an object, and the distance sensor in the embodiments of the present disclosure may be installed in the above-mentioned electronic device.
  • That is, the gesture recognition device may also acquire the vertical distances between the target hand and the preset plane collected by the distance sensor at the first moment and the second moment.
  • Step 104: when it is determined based on the horizontal motion stability data that the target hand is stable in the horizontal direction, and it is determined based on the vertical distance between the target hand and the preset plane that the target hand is stable in the vertical direction, perform gesture recognition on the target hand in the target image.
  • That is, gesture recognition is performed only when the hand is determined to be stable in both the horizontal and vertical directions.
  • In some embodiments, determining that the target hand is stable in the horizontal direction based on the horizontal motion stability data may include: if the difference between the velocity vector of the foreground area and the velocity vector of the background area is less than a preset threshold, determining that the target hand is stable in the horizontal direction.
  • Specifically, in the embodiment of the present disclosure, the difference between the velocity vector of the foreground area and that of the background area can be determined and compared with the preset threshold; if the difference is less than the preset threshold, the movement of the target hand in the horizontal direction is determined to be very small, and the hand is considered stable.
  • In some embodiments, determining that the target hand is stable in the vertical direction based on the vertical distance between the target hand and the preset plane includes: if the difference between the first vertical distance and the second vertical distance between the target hand and the preset plane is smaller than a first preset difference, determining that the target hand is stable in the vertical direction.
  • The first preset difference can be set according to actual conditions; for example, it can be 1 cm.
  • Specifically, the difference between the first vertical distance and the second vertical distance may be determined and compared with the first preset difference; if the difference is smaller than the first preset difference, the target hand has remained within a small range in the vertical direction, and it is determined that the target hand is stable in the vertical direction.
  • In the embodiment of the present disclosure, the gesture recognition device can judge whether the target hand is stable in the horizontal direction according to the horizontal motion stability data of the target hand, and judge whether the target hand is stable in the vertical direction according to the vertical distance between the target hand and the preset plane; if the target hand is determined to be stable in both the horizontal and vertical directions, gesture recognition can be performed on the target hand in the target image. Various gesture recognition manners may be adopted, which are not limited in the embodiments of the present disclosure.
  • In some embodiments, before performing gesture recognition on the target hand of the target image, the method further includes: judging whether the difference between the vertical distance between the target hand and the preset plane and a preset distance is smaller than a second preset difference; and when the difference between the vertical distance and the preset distance is smaller than the second preset difference, performing gesture recognition on the target hand in the target image.
  • The preset distance can be a preset recognition distance between the hand and the preset plane, which can be set according to the actual use scenario. For example, the preset distance can be larger in some scenes, such as 10 cm, and shorter when performing gesture recognition in a point-and-read scene.
  • Specifically, the difference between the vertical distance and the preset distance can be determined and compared with the second preset difference; if the difference is smaller than the second preset difference, the target hand meets the distance requirement for gesture recognition, and gesture recognition can then be performed on the target image.
  • the second preset difference may be the same as or different from the above-mentioned first preset difference.
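The two vertical checks, stability against the first preset difference and the recognition-distance requirement against the second, reduce to simple comparisons. A hedged sketch with illustrative default values (1 cm and 10 cm, echoing the examples above); the function names are not from the patent:

```python
def vertically_stable(d1, d2, first_preset_diff=1.0):
    """Stable in the vertical direction: the vertical distances sampled at
    the first and second moments differ by less than the first preset
    difference (e.g. 1 cm)."""
    return abs(d1 - d2) < first_preset_diff

def within_recognition_range(d, preset_distance=10.0, second_preset_diff=1.0):
    """Distance requirement: the current vertical distance is within the
    second preset difference of the preset recognition distance."""
    return abs(d - preset_distance) < second_preset_diff

print(vertically_stable(12.0, 11.6))    # True: moved only 0.4 cm
print(within_recognition_range(10.4))   # True: 0.4 cm from the 10 cm target
print(within_recognition_range(15.0))   # False: too far from the plane
```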
  • performing gesture recognition on the target hand of the target image may include: performing gesture segmentation and feature extraction on the target image, and then performing gesture recognition using a gesture recognition algorithm based on the extracted features.
  • the aforementioned preset gesture recognition algorithm may include a template matching algorithm, a statistical analysis algorithm, a neural network algorithm, etc., and is not specifically limited.
  • Specifically, the gesture recognition device can perform gesture segmentation on the target image, for example using a threshold method, an edge detection method, or a physical feature method; feature extraction can then be performed on the segmented gesture area, where the extracted features can include contours, edges, image moments, image feature vectors, regional histogram features, and the like, without specific limitation; and then, based on the extracted features, a preset gesture recognition algorithm can be used for gesture recognition to obtain the final recognition result.
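As a toy end-to-end illustration of the segmentation, feature extraction, and template matching stages (template matching being one of the recognition algorithms the text names), the sketch below uses threshold segmentation and raw image moments. Everything here, the threshold, the tiny feature vector, and the template values, is hypothetical; a real system would use richer features such as contours or Hu moments.

```python
import numpy as np

def segment_gesture(gray, threshold=128):
    """Threshold-based gesture segmentation: pixels brighter than the
    threshold are treated as the hand region (binary mask)."""
    return gray > threshold

def moment_features(mask):
    """A tiny feature vector from raw image moments: area and centroid."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return np.zeros(3)
    return np.array([float(len(xs)), xs.mean(), ys.mean()])

def recognize(features, templates):
    """Template matching: the template whose feature vector is closest
    (Euclidean distance) to the extracted features wins."""
    return min(templates, key=lambda name: np.linalg.norm(features - templates[name]))

# Toy image: a bright 40x40 square on a dark background plays the hand.
gray = np.zeros((100, 100))
gray[30:70, 30:70] = 255
feats = moment_features(segment_gesture(gray))
templates = {
    "open_palm": np.array([1600.0, 49.5, 49.5]),  # hypothetical template
    "fist": np.array([400.0, 49.5, 49.5]),        # hypothetical template
}
print(recognize(feats, templates))  # open_palm
```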
  • The gesture recognition scheme acquires a target image, determines horizontal motion stability data of the target hand by performing motion recognition on the target image, and determines the vertical distance between the target hand and the preset plane; when the target hand is determined to be stable in the horizontal direction based on the horizontal motion stability data, and determined to be stable in the vertical direction based on the vertical distance between the target hand and the preset plane, gesture recognition is performed on the target hand in the target image.
  • Because gesture recognition is performed only after the hand is stable, the large errors caused in the related art by hand movement in the horizontal and/or vertical directions are avoided, thereby improving the accuracy of gesture recognition.
  • FIG. 2 is a schematic flowchart of another gesture recognition method provided by an embodiment of the present disclosure. On the basis of the above embodiments, this embodiment further describes the above gesture recognition method. As shown in FIG. 2, the method includes:
  • Step 201: acquire a target image.
  • the target image may include an RGB image frame and a depth image.
  • Steps 202-203 can be executed first and then steps 204-205; steps 204-205 can also be executed first and then steps 202-203; alternatively, step 202 and step 204 can be executed first (in either order), followed by step 203 and step 205 (in either order), as determined by the actual situation.
  • the order of execution in Figure 2 is just an example.
  • Step 202: determine the horizontal motion stability data of the target hand by performing motion recognition on the target image.
  • In some embodiments, the target image includes a current RGB image frame and a previous RGB image frame extracted from the video, and the horizontal motion stability data of the target hand is determined by performing motion recognition on the target image, including: based on the current RGB image frame and the previous RGB image frame, using an optical flow algorithm to calculate the optical flow field of the current RGB image frame, and thresholding the optical flow field to obtain a foreground area including the target hand and a background area; and determining the velocity vector of the foreground area and the velocity vector of the background area as the horizontal motion stability data of the target hand.
  • Step 203: determine whether the target hand is stable in the horizontal direction based on the horizontal motion stability data; if yes, execute step 204; otherwise, return to step 201.
  • Specifically, if the difference between the velocity vector of the foreground area and the velocity vector of the background area is less than the preset threshold, it is determined that the target hand is stable in the horizontal direction, and step 204 is performed; if the difference is greater than or equal to the preset threshold, it is determined that the target hand is unstable in the horizontal direction, and execution returns to step 201.
  • Step 204: determine the vertical distance between the target hand and the preset plane.
  • the preset plane is a horizontal plane or a vertical plane.
  • In some embodiments, the target image includes a first depth image at a first moment and a second depth image at a second moment, and determining the vertical distance between the target hand and the preset plane includes: based on the first depth image and the second depth image, respectively extracting a first vertical distance and a second vertical distance between the target hand and the preset plane at the first moment and the second moment, where both the first depth image and the second depth image include the target hand and the preset plane.
  • In other embodiments, determining the vertical distance between the target hand and the preset plane includes: using a distance sensor to respectively determine a first vertical distance and a second vertical distance between the target hand and the preset plane at the first moment and the second moment.
  • Step 205: determine whether the target hand is stable in the vertical direction based on the vertical distance between the target hand and the preset plane; if yes, perform step 206; otherwise, return to step 201.
  • Specifically, if the difference between the first vertical distance and the second vertical distance between the target hand and the preset plane is smaller than the first preset difference, it is determined that the target hand is stable in the vertical direction, and step 206 can be performed; if the difference is greater than or equal to the first preset difference, it is determined that the target hand is unstable in the vertical direction, and execution returns to step 201.
  • Step 206: judge whether the difference between the vertical distance between the target hand and the preset plane and the preset distance is smaller than the second preset difference; if yes, go to step 207; otherwise, return to step 201.
  • Step 207: perform gesture recognition on the target hand in the target image.
  • performing gesture recognition on the target hand of the target image may include: performing gesture segmentation and feature extraction on the target image, and performing gesture recognition using a preset gesture recognition algorithm based on the extracted features.
  • FIG. 3 is a schematic diagram of gesture recognition provided by an embodiment of the present disclosure.
  • Specifically, the gesture recognition process may include: Step 21, start. Step 22, acquire the RGB image frame and the depth image in the video; that is, acquire the above-mentioned target image, where the target image includes an RGB image frame and a depth image.
  • Step 23, based on the current RGB image frame and the previous RGB image frame, use the optical flow algorithm to perform motion recognition on the current RGB image frame and determine the horizontal motion stability data of the target hand.
  • Step 24, determine whether the target hand is stable in the horizontal direction based on the horizontal motion stability data; if yes, execute step 25; otherwise, return to step 22.
  • Step 25, determine the vertical distance between the target hand and the preset plane based on the depth image.
  • Step 26, determine whether the target hand is stable in the vertical direction based on the vertical distance between the target hand and the preset plane; if yes, perform step 27; otherwise, return to step 22.
  • Step 27, determine whether the vertical distance between the target hand and the preset plane reaches the preset distance; if yes, go to step 28; otherwise, return to step 22.
  • Step 28, start gesture recognition. When the target hand is stable in both the horizontal and vertical directions, and the vertical distance between the target hand and the preset plane reaches the preset distance, gesture recognition is started. Step 29, subsequent processing: specifically, the gesture recognized in real time may be matched with a preset gesture, and if the matching succeeds, the gesture recognition is completed. Step 30, end.
  • In the embodiment of the present disclosure, horizontal motion recognition of the hand is carried out through the optical flow algorithm, the vertical distance between the hand and the preset plane is determined based on depth information, and gesture recognition is then performed when the target hand is stable in both the horizontal and vertical directions, so that gesture recognition results with higher accuracy can be obtained.
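The flow of FIG. 3 amounts to a polling loop that keeps fetching frames (step 22) until every check passes. A minimal sketch with injected stand-in callables for the three checks; all names here are hypothetical, not from the patent:

```python
def recognition_loop(frames, horiz_ok, vert_ok, distance_ok, recognize):
    """Poll frames until the hand is stable horizontally (step 24), stable
    vertically (step 26) and within the preset distance (step 27), then run
    gesture recognition (step 28) and return its result."""
    for frame in frames:
        if not horiz_ok(frame):
            continue  # back to step 22: fetch the next frame
        if not vert_ok(frame):
            continue
        if not distance_ok(frame):
            continue
        return recognize(frame)  # steps 28-29
    return None  # video ended before the hand stabilised

# Toy run: the checks start passing from frame 3 onwards.
result = recognition_loop(
    range(5),
    horiz_ok=lambda f: f >= 2,
    vert_ok=lambda f: f >= 3,
    distance_ok=lambda f: f >= 3,
    recognize=lambda f: f"gesture@{f}",
)
print(result)  # gesture@3
```

Ordering the cheap horizontal check before the depth-based checks mirrors the flowchart, where failing any check sends execution straight back to frame acquisition.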
  • The gesture recognition scheme acquires a target image, determines horizontal motion stability data of the target hand by performing motion recognition on the target image, and determines the vertical distance between the target hand and the preset plane; when the target hand is determined to be stable in the horizontal direction based on the horizontal motion stability data, and determined to be stable in the vertical direction based on the vertical distance between the target hand and the preset plane, gesture recognition is performed on the target hand in the target image.
  • Because gesture recognition is performed only after the hand is stable, the large errors caused in the related art by hand movement in the horizontal and/or vertical directions are avoided, thereby improving the accuracy of gesture recognition.
  • FIG. 4 is a schematic structural diagram of a gesture recognition device provided by an embodiment of the present disclosure.
  • The device can be implemented by software and/or hardware, and can generally be integrated into an electronic device. As shown in FIG. 4, the device includes:
  • An image acquisition module 301 configured to acquire a target image
  • A horizontal data module 302 configured to determine the horizontal motion stability data of the target hand by performing motion recognition on the target image
  • a vertical data module 303 configured to determine the vertical distance between the target hand and a preset plane
  • A gesture recognition module 304 configured to perform gesture recognition on the target hand of the target image when it is determined based on the horizontal motion stability data that the target hand is stable in the horizontal direction, and it is determined based on the vertical distance between the target hand and a preset plane that the target hand is stable in the vertical direction.
  • the target image includes the current RGB image frame and the previous RGB image frame extracted from the video
  • the horizontal data module 302 is specifically used for:
  • based on the current RGB image frame and the previous RGB image frame, use an optical flow algorithm to calculate the optical flow field of the current RGB image frame, and threshold the optical flow field to obtain a foreground area including the target hand and a background area;
  • determine the velocity vector of the foreground area and the velocity vector of the background area as the horizontal motion stability data of the target hand.
  • the gesture recognition module 304 is specifically configured to:
  • if the difference between the velocity vector of the foreground area and the velocity vector of the background area is smaller than a preset threshold, it is determined that the target hand is stable in the horizontal direction.
  • the target image includes a first depth image at a first moment and a second depth image at a second moment
  • the vertical data module 303 is specifically used for:
  • based on the first depth image and the second depth image, respectively extract the first vertical distance and the second vertical distance between the target hand and the preset plane at the first moment and the second moment, where both the first depth image and the second depth image include the target hand and the preset plane.
  • the vertical data module 303 is specifically used for:
  • a distance sensor is used to determine a first vertical distance and a second vertical distance between the target hand and the preset plane at the first moment and the second moment respectively.
  • the gesture recognition module 304 is specifically configured to:
  • when the difference between the first vertical distance and the second vertical distance between the target hand and the preset plane is smaller than a first preset difference, it is determined that the target hand is stable in the vertical direction.
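The vertical stability decision reduces to a single comparison. A minimal Python sketch, assuming the two distances have already been measured (for example by the distance sensor, or derived from the two depth images) and assuming metres as the unit and 0.01 as the first preset difference, none of which the disclosure specifies:

```python
def is_vertically_stable(first_distance, second_distance, preset_difference=0.01):
    """The target hand is considered stable in the vertical direction
    when the hand-to-plane distance changes by less than a first preset
    difference between the first moment and the second moment."""
    return abs(first_distance - second_distance) < preset_difference
```

For example, a hand hovering 0.200 m and then 0.205 m above a tabletop would pass the check, while a hand moving from 0.20 m to 0.25 m would not.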
  • the preset plane is a horizontal plane or a vertical plane.
  • the device further includes a vertical judging module, configured to: before performing gesture recognition on the target hand of the target image,
  • the gesture recognition module 304 is specifically configured to:
  • gesture recognition is performed using a preset gesture recognition algorithm based on the extracted features.
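Putting the two checks together, the module's gating behaviour might be sketched as follows. Here `extract_features` and `classify` are hypothetical stand-ins for the feature extraction step and the preset gesture recognition algorithm, neither of which the disclosure specifies:

```python
def recognize_if_stable(target_image, horizontally_stable, vertically_stable,
                        extract_features, classify):
    """Run the preset recognition pipeline only when the target hand is
    stable in both the horizontal and the vertical direction; otherwise
    skip the frame and report no gesture."""
    if not (horizontally_stable and vertically_stable):
        return None  # hand still moving: recognition would be error-prone
    return classify(extract_features(target_image))
```

With toy stand-ins, `recognize_if_stable(frame, True, True, extractor, classifier)` returns a gesture label, while any unstable frame yields `None` and is skipped.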
  • the gesture recognition device provided by the embodiments of the present disclosure can execute the gesture recognition method provided by any embodiment of the present disclosure, and has corresponding functional modules and beneficial effects for executing the method.
  • An embodiment of the present disclosure further provides a computer program product, including a computer program/instruction, and when the computer program/instruction is executed by a processor, the gesture recognition method provided in any embodiment of the present disclosure is implemented.
  • FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure. Specifically, FIG. 5 shows a schematic structural diagram of an electronic device 400 suitable for implementing an embodiment of the present disclosure.
  • the electronic device 400 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players) and vehicle-mounted terminals (such as car navigation terminals), and fixed terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 5 is only an example, and should not limit the functions and scope of use of the embodiments of the present disclosure.
  • an electronic device 400 may include a processing device (such as a central processing unit, a graphics processing unit, etc.) 401, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 408 into a random access memory (RAM) 403. Various programs and data necessary for the operation of the electronic device 400 are also stored in the RAM 403.
  • the processing device 401, the ROM 402 and the RAM 403 are connected to each other through a bus 404.
  • An input/output (I/O) interface 405 is also connected to bus 404 .
  • the following devices can be connected to the I/O interface 405: an input device 406 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 407 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 408 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 409.
  • the communication means 409 may allow the electronic device 400 to perform wireless or wired communication with other devices to exchange data. While FIG. 5 shows electronic device 400 having various means, it should be understood that implementing or having all of the means shown is not a requirement. More or fewer means may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product, which includes a computer program carried on a non-transitory computer readable medium, where the computer program includes program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from a network via communication means 409, or from storage means 408, or from ROM 402.
  • when the computer program is executed by the processing device 401, the above-mentioned functions defined in the gesture recognition method of the embodiments of the present disclosure are executed.
  • the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the above two.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to, an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted by any appropriate medium, including but not limited to wires, optical cables, RF (radio frequency), etc., or any suitable combination of the above.
  • the client and the server can communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks ("LANs"), wide area networks ("WANs"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed network.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device: acquires a target image; determines horizontal motion stability data of a target hand by performing motion recognition on the target image; determines a vertical distance between the target hand and a preset plane; and, when it is determined based on the horizontal motion stability data that the target hand is stable in the horizontal direction and it is determined based on the vertical distance between the target hand and the preset plane that the target hand is stable in the vertical direction, performs gesture recognition on the target hand in the target image.
  • Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
  • for example, without limitation, exemplary types of hardware logic components that may be used include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chips (SOCs), complex programmable logic devices (CPLDs), and the like.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • more specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • the present disclosure provides a gesture recognition method, including:
  • acquiring a target image; determining horizontal motion stability data of a target hand by performing motion recognition on the target image; determining a vertical distance between the target hand and a preset plane; and, when it is determined based on the horizontal motion stability data that the target hand is stable in the horizontal direction and it is determined based on the vertical distance between the target hand and the preset plane that the target hand is stable in the vertical direction, performing gesture recognition on the target hand in the target image.
  • the target image includes the current RGB image frame and the previous RGB image frame extracted from the video, and determining the horizontal motion stability data of the target hand by performing motion recognition on the target image includes:
  • an optical flow algorithm is used to calculate the optical flow field of the current RGB image frame, and the optical flow field is thresholded to obtain the foreground area and the background area of the target hand
  • the velocity vector of the foreground area and the velocity vector of the background area are determined as the horizontal motion stabilization data of the target hand.
  • determining that the target hand is stable in the horizontal direction based on the horizontal motion stability data includes:
  • when the difference between the velocity vector of the foreground area and the velocity vector of the background area is smaller than a preset threshold, it is determined that the target hand is stable in the horizontal direction.
  • the target image includes a first depth image at a first moment and a second depth image at a second moment, and determining the vertical distance between the target hand and the preset plane includes:
  • both the first depth image and the second depth image include the target hand and the preset plane.
  • determining the vertical distance between the target hand and the preset plane includes:
  • a distance sensor is used to determine a first vertical distance and a second vertical distance between the target hand and the preset plane at the first moment and the second moment respectively.
  • determining that the target hand is stable in the vertical direction based on the vertical distance between the target hand and a preset plane includes:
  • when the difference between the first vertical distance and the second vertical distance between the target hand and the preset plane is smaller than a first preset difference, it is determined that the target hand is stable in the vertical direction.
  • the preset plane is a horizontal plane or a vertical plane.
  • the gesture recognition method provided in the present disclosure, before performing gesture recognition on the target hand of the target image, further includes:
  • performing gesture recognition on the target hand of the target image includes:
  • gesture recognition is performed using a preset gesture recognition algorithm based on the extracted features.
  • the present disclosure provides a gesture recognition device, including:
  • An image acquisition module configured to acquire a target image
  • the horizontal data module is used to determine the horizontal motion stability data of the target hand by performing motion recognition on the target image
  • a vertical data module configured to determine the vertical distance between the target hand and a preset plane
  • a gesture recognition module configured to perform gesture recognition on the target hand in the target image when it is determined based on the horizontal motion stability data that the target hand is stable in the horizontal direction, and it is determined based on the vertical distance between the target hand and a preset plane that the target hand is stable in the vertical direction.
  • the target image includes the current RGB image frame and the previous RGB image frame extracted from the video
  • the horizontal data module is specifically used for:
  • an optical flow algorithm is used to calculate the optical flow field of the current RGB image frame, and the optical flow field is thresholded to obtain the foreground area and the background area of the target hand
  • the velocity vector of the foreground area and the velocity vector of the background area are determined as the horizontal motion stabilization data of the target hand.
  • the gesture recognition module is specifically used for:
  • when the difference between the velocity vector of the foreground area and the velocity vector of the background area is smaller than a preset threshold, it is determined that the target hand is stable in the horizontal direction.
  • the target image includes a first depth image at a first moment and a second depth image at a second moment
  • the vertical data module is specifically used for:
  • both the first depth image and the second depth image include the target hand and the preset plane.
  • the vertical data module is specifically used for:
  • a distance sensor is used to determine a first vertical distance and a second vertical distance between the target hand and the preset plane at the first moment and the second moment respectively.
  • the gesture recognition module is specifically used for:
  • when the difference between the first vertical distance and the second vertical distance between the target hand and the preset plane is smaller than a first preset difference, it is determined that the target hand is stable in the vertical direction.
  • the preset plane is a horizontal plane or a vertical plane.
  • the device further includes a vertical judgment module, configured to: before performing gesture recognition on the target hand of the target image,
  • the gesture recognition module is specifically used for:
  • gesture recognition is performed using a preset gesture recognition algorithm based on the extracted features.
  • the present disclosure provides an electronic device, including:
  • the processor is configured to read the executable instructions from the memory, and execute the instructions to implement any gesture recognition method provided in the present disclosure.
  • the present disclosure provides a computer-readable storage medium, the storage medium stores a computer program, and the computer program is used to perform any of the gesture recognition methods provided in the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • User Interface Of Digital Computer (AREA)
  • Image Analysis (AREA)

Abstract

According to the embodiments, the present invention relates to a gesture recognition method and apparatus, and a device and a medium. The method comprises the following steps: acquiring a target image; determining horizontal motion stability data of a target hand by means of motion recognition on the target image; determining the vertical distance between the target hand and a preset plane; and, when it is determined according to the horizontal motion stability data that the target hand is stable in the horizontal direction, and it is determined according to the vertical distance between the target hand and the preset plane that the target hand is stable in the vertical direction, performing gesture recognition on the target hand in the target image. By means of this technical solution, stability in the horizontal direction and in the vertical direction is determined before gesture recognition is performed, and gesture recognition is carried out only after the hand has become stable, which avoids the relatively large error caused in the prior art by hand-motion interference in the horizontal and/or vertical direction, and thereby improves the accuracy of gesture recognition.
PCT/CN2022/109467 2021-08-20 2022-08-01 Procédé et appareil de reconnaissance de gestes, et dispositif et support WO2023020268A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110962932.4 2021-08-20
CN202110962932.4A CN113642493B (zh) 2021-08-20 2021-08-20 一种手势识别方法、装置、设备及介质

Publications (1)

Publication Number Publication Date
WO2023020268A1 true WO2023020268A1 (fr) 2023-02-23

Family

ID=78423231

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/109467 WO2023020268A1 (fr) 2021-08-20 2022-08-01 Procédé et appareil de reconnaissance de gestes, et dispositif et support

Country Status (2)

Country Link
CN (1) CN113642493B (fr)
WO (1) WO2023020268A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642493B (zh) * 2021-08-20 2024-02-09 北京有竹居网络技术有限公司 一种手势识别方法、装置、设备及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106970701A (zh) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 一种手势变化识别方法
CN107589850A (zh) * 2017-09-26 2018-01-16 深圳睛灵科技有限公司 一种手势移动方向的识别方法及系统
US20190034714A1 (en) * 2016-02-05 2019-01-31 Delphi Technologies, Llc System and method for detecting hand gestures in a 3d space
CN112306235A (zh) * 2020-09-25 2021-02-02 北京字节跳动网络技术有限公司 一种手势操作方法、装置、设备和存储介质
CN113642493A (zh) * 2021-08-20 2021-11-12 北京有竹居网络技术有限公司 一种手势识别方法、装置、设备及介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100748174B1 (ko) * 2005-12-27 2007-08-09 엠텍비젼 주식회사 동영상의 손떨림 검출 및 보정 장치
CN102779268B (zh) * 2012-02-06 2015-04-22 西南科技大学 基于方向运动历史图及竞争机制的手挥运动方向判定方法
KR101593950B1 (ko) * 2014-05-28 2016-02-15 숭실대학교산학협력단 손동작 기반의 인터페이스 장치 및 이를 이용한 포인팅 방법
CN112464833A (zh) * 2020-12-01 2021-03-09 平安科技(深圳)有限公司 基于光流的动态手势识别方法、装置、设备及存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106970701A (zh) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 一种手势变化识别方法
US20190034714A1 (en) * 2016-02-05 2019-01-31 Delphi Technologies, Llc System and method for detecting hand gestures in a 3d space
CN107589850A (zh) * 2017-09-26 2018-01-16 深圳睛灵科技有限公司 一种手势移动方向的识别方法及系统
CN112306235A (zh) * 2020-09-25 2021-02-02 北京字节跳动网络技术有限公司 一种手势操作方法、装置、设备和存储介质
CN113642493A (zh) * 2021-08-20 2021-11-12 北京有竹居网络技术有限公司 一种手势识别方法、装置、设备及介质

Also Published As

Publication number Publication date
CN113642493B (zh) 2024-02-09
CN113642493A (zh) 2021-11-12

Similar Documents

Publication Publication Date Title
CN109584276B (zh) 关键点检测方法、装置、设备及可读介质
WO2018177379A1 (fr) Reconnaissance de geste, commande de geste et procédés et appareils d'apprentissage de réseau neuronal, et dispositif électronique
US10204423B2 (en) Visual odometry using object priors
US10891473B2 (en) Method and device for use in hand gesture recognition
CN108198044B (zh) 商品信息的展示方法、装置、介质及电子设备
EP3754542A1 (fr) Procédé et appareil de reconnaissance d'écriture manuscrite dans l'air ainsi que dispositif et support d'informations lisible par ordinateur
US11069365B2 (en) Detection and reduction of wind noise in computing environments
WO2022237811A1 (fr) Procédé et appareil de traitement d'image et dispositif
US9224064B2 (en) Electronic device, electronic device operating method, and computer readable recording medium recording the method
CN110163171B (zh) 用于识别人脸属性的方法和装置
CN110660102B (zh) 基于人工智能的说话人识别方法及装置、系统
JP7181375B2 (ja) 目標対象の動作認識方法、装置及び電子機器
KR20120044484A (ko) 이미지 처리 시스템에서 물체 추적 장치 및 방법
US10846565B2 (en) Apparatus, method and computer program product for distance estimation between samples
WO2022105622A1 (fr) Procédé et appareil de segmentation d'image, support lisible et dispositif électronique
WO2023020268A1 (fr) Procédé et appareil de reconnaissance de gestes, et dispositif et support
CN112488095A (zh) 印章图像识别方法、装置和电子设备
WO2013001144A1 (fr) Procédé et appareil de poursuite d'un visage par utilisation de projections de gradient intégrales
CN112306235A (zh) 一种手势操作方法、装置、设备和存储介质
WO2022194145A1 (fr) Procédé et appareil de détermination de position de photographie, dispositif et support
WO2022194158A1 (fr) Procédé et appareil de suivi de cible, dispositif et support
CN116659646A (zh) 一种基于机器视觉的风机叶片振动检测方法及装置
CN111310595A (zh) 用于生成信息的方法和装置
WO2022194157A1 (fr) Procédé et appareil de suivi de cible, dispositif et support
CN111209050A (zh) 用于切换电子设备的工作模式的方法和装置

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE