WO2019153379A1 - Method and apparatus for implementing gesture operations - Google Patents

Method and apparatus for implementing gesture operations

Info

Publication number
WO2019153379A1
WO2019153379A1 (PCT application PCT/CN2018/077454)
Authority
WO
WIPO (PCT)
Prior art keywords
gradient direction
direction histogram
image
gesture
implementing
Prior art date
Application number
PCT/CN2018/077454
Other languages
English (en)
French (fr)
Inventor
王声平
张立新
Original Assignee
深圳市沃特沃德股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市沃特沃德股份有限公司
Publication of WO2019153379A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507Summing image-intensity values; Histogram projection analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language

Definitions

  • the present invention relates to the field of electronic technologies, and in particular, to a method and apparatus for implementing gesture operations.
  • the gesture operation mode has begun to be applied to terminal devices.
  • the general flow of a gesture operation is: the user performs a preset gesture action in front of the terminal device; the terminal device recognizes the human body's gesture action, translates it into a corresponding operation instruction, and executes it.
  • through gesture operation, the user can control the terminal device from a distance, without approaching or touching it, which brings a novel operating experience to the user.
  • existing terminal devices can only recognize one person's gesture action at a time.
  • when multiple people make gestures simultaneously, the terminal device cannot recognize them all, and thus cannot respond to multiple people's gesture operations at the same time. It therefore cannot be applied to multi-person application scenarios, which limits the application range of gesture operation.
  • a main object of the present invention is to provide a method and apparatus for implementing a gesture operation, which aims to solve the technical problem that an existing terminal device cannot simultaneously respond to a gesture operation of a plurality of people, and expands the application range of the gesture operation.
  • the embodiment of the present invention provides a method for implementing a gesture operation, where the method includes the following steps:
  • the corresponding operation instruction is executed according to the gesture action of each human body.
  • the steps of separately identifying gesture actions of each human body include:
  • Gesture recognition is performed on each area, and the recognized gestures of the respective areas are used as gestures of the respective human bodies.
  • the step of performing human body detection comprises: performing human body detection based on a gradient direction histogram.
  • the step of performing human body detection based on the gradient direction histogram comprises:
  • the gradient direction histogram of each sub-window is composed into a human body feature vector.
  • the step of calculating a gradient direction histogram of each cell in the image includes:
  • a gradient direction histogram of all pixels in each cell within the image is counted.
  • the step of counting a gradient direction histogram of all pixels in each cell in the image includes:
  • a weighted voting calculation is performed according to the gradient direction of each pixel in the cell, and a gradient direction histogram of all the pixels in the cell is obtained.
  • the weight of each pixel is the gradient magnitude of the pixel.
  • the step of performing a weighted voting calculation according to a gradient direction of each pixel in the cell includes:
  • Weighted voting calculations are performed using trilinear interpolation.
  • Optionally, N = 4.
  • An embodiment of the present invention further provides an apparatus for implementing a gesture operation, where the apparatus includes:
  • a detection module configured to perform human body detection
  • the identification module is configured to separately recognize gesture actions of each human body when at least two human bodies are detected;
  • the execution module is configured to execute a corresponding operation instruction according to the gesture action of each human body.
  • the identifying module includes:
  • a first dividing unit configured to divide the detected different human bodies into different regions
  • the gesture recognition unit is configured to separately perform gesture recognition on each area, and use the recognized gestures of the respective areas as the gesture actions of the respective human bodies.
  • the detecting module is configured to perform human body detection based on the gradient direction histogram.
  • the detecting module includes:
  • a first calculating unit configured to perform a first-order gradient calculation on the image in the detection window
  • a second calculating unit configured to calculate a gradient direction histogram of each cell in the image
  • a first processing unit configured to perform normalization processing on all cells in each block in the image to obtain a gradient direction histogram of the block
  • the second processing unit is configured to perform normalization processing on all the blocks in the image to obtain a gradient direction histogram of the detection window, and use a gradient direction histogram of the detection window as a human body feature vector.
  • the detecting module includes:
  • a second dividing unit configured to divide the detection window into N sub-windows, N ⁇ 2;
  • a third calculating unit configured to perform a first-order gradient calculation on the images in each sub-window
  • a fourth calculating unit configured to calculate a gradient direction histogram of each cell in the image in each sub-window
  • a third processing unit configured to normalize all cells in each block in the image in each sub-window to obtain a gradient direction histogram of the block
  • a fourth processing unit configured to perform normalization processing on all blocks in the image in each sub-window to obtain a gradient direction histogram of the sub-window
  • the combining unit is configured to form a gradient direction histogram of each sub-window into a human body feature vector.
  • the second calculating unit includes:
  • Calculating a subunit configured to calculate a gradient of each pixel within the image
  • a statistical subunit arranged to count a gradient direction histogram of all pixels in each cell within the image.
  • the statistical subunit includes:
  • a dividing subunit configured to divide [0, π] into multiple intervals for each cell
  • the weighting calculation subunit is configured to perform a weighted voting calculation according to a gradient direction of each pixel in the cell to obtain a gradient direction histogram of all the pixels in the cell.
  • the weighting calculation subunit performs the weighted voting calculation using trilinear interpolation.
  • Embodiments of the present invention also provide an apparatus for implementing gesture operations, including a memory, a processor, and at least one application stored in the memory and configured to be executed by the processor, the application being configured to perform the aforementioned method for implementing gesture operations.
  • in the method for implementing gesture operations, human body detection is performed; when at least two human bodies are detected, the gesture actions of each human body are identified separately, and corresponding operation instructions are executed according to each human body's gesture action. Gestures of multiple people are thus recognized simultaneously and their gesture operations can be responded to, so that gesture operation can be applied to multi-person application scenarios, expanding the application range of gesture operation.
  • FIG. 1 is a flowchart of an embodiment of a method for implementing gesture operations according to the present invention;
  • FIG. 2 is a specific flowchart of the step of performing human body detection in an embodiment of the present invention;
  • FIG. 3 is another specific flowchart of the step of performing human body detection in an embodiment of the present invention;
  • FIG. 4 is a block diagram of an embodiment of an apparatus for implementing gesture operations according to the present invention;
  • FIG. 5 is a block diagram of the detection module of FIG. 4;
  • FIG. 6 is a block diagram of the second calculating unit of FIG. 5;
  • FIG. 7 is a block diagram of the statistical subunit of FIG. 6;
  • FIG. 8 is another block diagram of the detection module of FIG. 4;
  • FIG. 9 is a block diagram of the identification module of FIG. 4.
  • the terms "terminal" and "terminal device" used herein include both devices having only a wireless signal receiver without transmitting capability and devices having both receiving and transmitting hardware.
  • Such devices may include cellular or other communication devices with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) terminals, which may combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants), which can include radio frequency receivers, pagers, Internet/intranet access, web browsers, notepads, calendars, and/or GPS (Global Positioning System) receivers; and conventional laptop and/or palmtop computers or other devices that include a radio frequency receiver.
  • the terminal may be portable, transportable, installed in a vehicle (air, sea, and/or land), or adapted and/or configured to operate locally and/or run in a distributed form at any other location on Earth and/or in space.
  • the "terminal" or "terminal device" used herein may also be a communication terminal, an internet terminal, or a music/video playing terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback functions, and may also be a smart TV, set-top box, or other such device.
  • the method and device for implementing gesture operations in the embodiments of the present invention can be applied to various terminal devices, including fixed terminals such as game machines, televisions, and personal computers, mobile terminals such as mobile phones and tablets, and the like.
  • the method includes the following steps:
  • image features such as the histogram of oriented gradients (HOG), scale-invariant feature transform (SIFT), local binary pattern (LBP), or Haar-like features may be used for human body detection.
  • the gradient direction histogram is a local descriptor similar to the scale invariant feature transformation, which constructs the human body feature by calculating the gradient direction histogram on the local region.
  • scale-invariant feature transformation is based on feature extraction of key points, which is a sparse description method, and gradient direction histogram is a dense description method.
  • the gradient direction histogram description method has the following advantages: it represents the structural features of edges (gradients), so local shape information can be described; quantization over position and orientation space can, to a certain extent, suppress the impact of translation and rotation; and normalization within local regions can partially offset the impact of illumination. Therefore, embodiments of the present invention preferably perform human body detection based on the gradient direction histogram.
  • the specific process of performing human body detection based on the gradient direction histogram in the embodiment of the present invention is as follows:
  • a detection window of normalized size (such as 64×128) is taken as input, and the gradients of the image in the detection window in the horizontal and vertical directions are calculated with the first-order (one-dimensional) Sobel operator [-1, 0, 1].
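The first-order gradient computation described above can be sketched as follows; this is a minimal illustration (the function name and replicate-edge border handling are our own choices, not specified by the patent):

```python
import numpy as np

def gradients(img):
    """Per-pixel horizontal (gx) and vertical (gy) gradients using the
    first-order one-dimensional kernel [-1, 0, 1]; border pixels are
    handled by replicating the edge values."""
    img = img.astype(float)
    padded_x = np.pad(img, ((0, 0), (1, 1)), mode="edge")
    padded_y = np.pad(img, ((1, 1), (0, 0)), mode="edge")
    gx = padded_x[:, 2:] - padded_x[:, :-2]   # f(x+1) - f(x-1)
    gy = padded_y[2:, :] - padded_y[:-2, :]   # f(y+1) - f(y-1)
    return gx, gy
```

On a horizontal intensity ramp, gx is constant in the interior and gy is zero, as expected for this kernel.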
  • the advantage of using a single window as a classifier input is that the classifier is invariant to the position and scale of the target.
  • the detection window needs to be moved in the horizontal and vertical directions, and the image is scaled at multiple scales to detect the human body at different scales.
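A minimal sketch of this multi-position, multi-scale scan (the stride of 8 pixels and the scale factor of 1.2 are illustrative assumptions; the patent does not fix them):

```python
import numpy as np

def sliding_windows(image, win_w=64, win_h=128, stride=8):
    """Yield the top-left corner of every detection-window position."""
    h, w = image.shape[:2]
    for y in range(0, h - win_h + 1, stride):
        for x in range(0, w - win_w + 1, stride):
            yield x, y

def image_pyramid(image, scale=1.2, min_size=(64, 128)):
    """Repeatedly downscale the image so the fixed-size window can
    cover human bodies at larger scales."""
    while image.shape[1] >= min_size[0] and image.shape[0] >= min_size[1]:
        yield image
        new_h = int(image.shape[0] / scale)
        new_w = int(image.shape[1] / scale)
        # nearest-neighbour resize keeps the sketch dependency-free
        ys = (np.arange(new_h) * image.shape[0] / new_h).astype(int)
        xs = (np.arange(new_w) * image.shape[1] / new_w).astype(int)
        image = image[np.ix_(ys, xs)]
```

Each pyramid level would then be scanned with `sliding_windows`, and every window classified by the human-body detector.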
  • the gradient direction histogram is obtained by dense calculation over a grid of units called cells and blocks.
  • the image is divided into cells, each cell consisting of multiple pixels, and each block is composed of several adjacent cells.
  • the gradient of each pixel in the image is first calculated, and then the gradient direction histogram of all pixels in each cell in the image, that is, the gradient direction histogram of the cell, is counted.
  • when counting the gradient direction histogram of each cell, [0, π] is first divided into multiple intervals for each cell, and then a weighted voting calculation is performed according to the gradient direction of each pixel in the cell to obtain the gradient direction histogram of the cell.
  • in the weighted voting calculation, the weight of each pixel is preferably the gradient magnitude of that pixel. To reduce aliasing, the weighted voting calculation preferably uses trilinear interpolation (Trilinear Interpolation).
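The per-cell weighted voting can be sketched as below. Full trilinear interpolation also distributes each vote spatially between neighbouring cells; this simplified sketch shows only the orientation component, splitting each pixel's gradient magnitude between the two nearest orientation bins (the 9-bin count and the function name are illustrative assumptions):

```python
import numpy as np

def cell_histogram(gx, gy, n_bins=9):
    """Magnitude-weighted orientation histogram for one cell.
    gx, gy: per-pixel horizontal/vertical gradients of the cell.
    Orientations are unsigned, binned over [0, pi); each pixel's
    gradient magnitude is split linearly between the two nearest bins."""
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi          # fold into [0, pi)
    bin_w = np.pi / n_bins
    pos = ang / bin_w - 0.5                   # continuous bin coordinate
    lo = np.floor(pos).astype(int)
    frac = pos - lo
    hist = np.zeros(n_bins)
    np.add.at(hist, lo % n_bins, mag * (1 - frac))
    np.add.at(hist, (lo + 1) % n_bins, mag * frac)
    return hist
```

Because each vote is split rather than snapped to one bin, the total histogram mass equals the total gradient magnitude of the cell.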
  • the gradient direction histogram of the cells in the block is normalized to eliminate the influence of illumination, thereby obtaining a gradient direction histogram of the block.
  • Each block in the image is traversed to obtain a gradient direction histogram for each block in the image.
  • the gradient direction histograms obtained by normalizing each block together constitute the gradient direction histogram of the detection window, i.e., the human body feature vector, thereby realizing human body detection.
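Block normalization and the assembly of the detection-window feature vector can be sketched as follows (L2 normalization with a small epsilon is a common choice, assumed here rather than stated by the patent):

```python
import numpy as np

def block_normalize(cell_hists, eps=1e-6):
    """L2-normalise one block's cell histograms and flatten them.
    cell_hists: array of shape (cells_per_block, n_bins)."""
    v = cell_hists.ravel().astype(float)
    return v / np.sqrt(np.sum(v * v) + eps)

def window_descriptor(blocks):
    """Concatenate every normalised block into the detection window's
    final human body feature vector."""
    return np.concatenate([block_normalize(b) for b in blocks])
```

Normalizing per block (rather than once globally) is what gives the descriptor its partial invariance to local illumination changes.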
  • the human body detection method shown in FIG. 3 can be used for human body detection, and specifically includes the following steps:
  • S204 Normalize all cells in each block in the image in each sub-window to obtain a gradient direction histogram of the block.
  • the gradient direction histogram of each sub-window is formed into a human body feature vector.
  • in step S201, the detection window is divided into N (N ≥ 2) sub-windows; for example, the four key areas of the human body in the detection window, namely the head region, left arm region, right arm region, and leg region, are used as sub-windows, dividing the detection window into 4 sub-windows.
  • steps S202-S205 for each sub-window, the gradient direction histogram of each sub-window is calculated in the same manner as the first scheme.
  • step S206 the gradient direction histogram of each sub-window is composed into the final human body feature vector.
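The sub-window scheme can be sketched as below. The patent names head, left-arm, right-arm, and leg regions but gives no coordinates, so the region boxes here (inside a 64×128 window) are purely illustrative:

```python
import numpy as np

# Hypothetical (x, y, w, h) boxes inside a 64x128 detection window.
SUB_WINDOWS = {
    "head":      (16,  0, 32, 32),
    "left_arm":  ( 0, 32, 24, 48),
    "right_arm": (40, 32, 24, 48),
    "legs":      ( 8, 80, 48, 48),
}

def subwindow_descriptor(window, describe):
    """Concatenate per-region descriptors into one feature vector.
    `describe` is any callable mapping an image patch to its HOG vector."""
    parts = []
    for x, y, w, h in SUB_WINDOWS.values():
        parts.append(describe(window[y:y + h, x:x + w]))
    return np.concatenate(parts)
```

Because only the four key regions are described, the feature dimension drops compared with the full-window scheme, which is the stated motivation for this variant.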
  • in gesture recognition, the detected different human bodies are first divided into different regions, then gesture recognition is performed on each region separately, and the recognized gesture of each region is taken as the gesture action of the corresponding human body.
  • the gesture actions of the respective human bodies are translated into corresponding operation instructions, and each operation instruction is executed separately.
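The final translate-and-execute step can be sketched as a per-person lookup; the gesture names and instruction strings below are hypothetical, since the patent defines no concrete gesture vocabulary:

```python
# Hypothetical gesture-to-instruction correspondence table.
GESTURE_TO_INSTRUCTION = {
    "swipe_left":  "previous_page",
    "swipe_right": "next_page",
    "open_palm":   "pause",
}

def execute_all(per_person_gestures):
    """Translate each detected person's gesture into an instruction
    and collect the instructions to run, one per person.
    per_person_gestures: dict person_id -> gesture name."""
    issued = []
    for person, gesture in per_person_gestures.items():
        instruction = GESTURE_TO_INSTRUCTION.get(gesture)
        if instruction is not None:        # ignore unrecognized gestures
            issued.append((person, instruction))
    return issued
```

Keying the loop on each detected person is what lets several people's gesture operations be serviced in the same frame.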
  • recognition of multiple people's gestures is thus realized, and multiple people's gesture operations can be responded to simultaneously.
  • in the method for implementing gesture operations in the embodiment of the present invention, human body detection is performed; when at least two human bodies are detected, the gesture actions of each human body are recognized separately, and corresponding operation instructions are executed according to each human body's gesture action. Gestures of multiple people are thus recognized simultaneously and their gesture operations can be responded to, so that gesture operation can be applied to multi-person application scenarios, expanding the application range of gesture operation.
  • the apparatus includes a detection module 10, an identification module 20, and an execution module 30, wherein: the detection module 10 is configured to perform human body detection; and the identification module 20 is configured to When at least two human bodies are detected, the gesture actions of the respective human bodies are respectively identified; and the execution module 30 is configured to execute corresponding operation instructions according to the gesture actions of the respective human bodies.
  • the detection module 10 can perform human body detection based on image features such as the histogram of oriented gradients (HOG), the scale-invariant feature transform (SIFT), the local binary pattern (LBP), or Haar-like features.
  • the gradient direction histogram is a local descriptor similar to the scale invariant feature transformation, which constructs the human body feature by calculating the gradient direction histogram on the local region.
  • scale-invariant feature transformation is based on feature extraction of key points, which is a sparse description method, and gradient direction histogram is a dense description method.
  • the gradient direction histogram description method has the following advantages: it represents the structural features of edges (gradients), so local shape information can be described; quantization over position and orientation space can, to a certain extent, suppress the impact of translation and rotation; and normalization within local regions can partially offset the impact of illumination. Therefore, embodiments of the present invention preferably perform human body detection based on the gradient direction histogram.
  • the detecting module 10 includes a first calculating unit 11, a second calculating unit 12, a first processing unit 13, and a second processing unit 14, wherein: the first calculating unit 11 is configured to perform a first-order gradient calculation on the image in the detection window; the second calculating unit 12 is configured to calculate the gradient direction histogram of each cell in the image; the first processing unit 13 is configured to normalize all cells in each block in the image to obtain the gradient direction histogram of the block; and the second processing unit 14 is configured to normalize all blocks in the image to obtain the gradient direction histogram of the detection window, using the gradient direction histogram of the detection window as the human body feature vector.
  • the first calculating unit 11 takes a detection window of normalized size (such as 64×128) as input, and calculates the gradients of the image in the detection window in the horizontal and vertical directions using the first-order (one-dimensional) Sobel operator [-1, 0, 1].
  • the advantage of using a single window as a classifier input is that the classifier is invariant to the position and scale of the target.
  • the detection window needs to be moved in the horizontal and vertical directions, and the image is scaled at multiple scales to detect the human body at different scales.
  • the gradient direction histogram is obtained by dense calculation over a grid of units called cells and blocks.
  • the image is divided into cells, each cell consisting of multiple pixels, and each block is composed of several adjacent cells.
  • the second calculating unit 12 includes a calculating subunit 121 and a statistic subunit 122, as shown in FIG. 6, wherein: the calculating subunit 121 is configured to calculate a gradient of each pixel in the image; the statistical subunit 122 , set to count the gradient direction histogram of all the pixels in each cell in the image, that is, the gradient direction histogram of the cell.
  • the statistical sub-unit 122 includes a dividing sub-unit 1221 and a weighting calculation sub-unit 1222, wherein: the dividing sub-unit 1221 is configured to divide [0, π] into a plurality of intervals for each cell;
  • the weighting calculation sub-unit 1222 is configured to perform a weighted voting calculation according to the gradient direction of each pixel in the cell to obtain the gradient direction histogram of all pixels in the cell.
  • when the weighting calculation sub-unit 1222 performs the weighted voting calculation, the weight of each pixel is preferably the gradient magnitude of that pixel. To reduce aliasing, the weighting calculation sub-unit 1222 preferably performs the weighted voting calculation using trilinear interpolation (Trilinear Interpolation).
  • the weighting calculation sub-unit 1222 traverses each cell in the image to obtain the gradient direction histogram of each cell in the image.
  • the first processing unit 13 normalizes the gradient direction histogram of the cells within the block to eliminate the influence of illumination, thereby obtaining a gradient direction histogram of the block.
  • the first processing unit 13 traverses each block in the image to obtain a gradient direction histogram for each block in the image.
  • the second processing unit 14 composes the gradient direction histograms obtained by normalizing each block into the gradient direction histogram of the detection window, which serves as the human body feature vector, thereby realizing human body detection.
  • since the gradient direction histogram is a dense calculation method, the amount of calculation is large. To reduce the amount of calculation and increase detection speed, gradient direction histograms may be computed only in key areas where the human contour is relatively obvious, thereby reducing the feature dimension.
  • the detecting module 10 can also include a second dividing unit 101, a third calculating unit 102, a fourth calculating unit 103, a third processing unit 104, a fourth processing unit 105, and a combining unit 106, as shown in FIG.
  • the second dividing unit 101 is configured to divide the detection window into N sub-windows, N ≥ 2; the third calculating unit 102 is configured to perform a first-order gradient calculation on the images in each sub-window; the fourth calculating unit 103 is configured to calculate the gradient direction histogram of each cell in the image in each sub-window; the third processing unit 104 is configured to normalize all cells in each block in the image in each sub-window to obtain the gradient direction histogram of the block; the fourth processing unit 105 is configured to normalize all blocks in the image in each sub-window to obtain the gradient direction histogram of the sub-window; and the combining unit 106 is configured to compose the gradient direction histograms of the sub-windows into the human body feature vector.
  • the second dividing unit 101 divides the detection window into four sub-windows, for example, the four key areas of the head region, the left arm region, the right arm region, and the leg region of the human body in the detection window are used as sub-windows.
  • the third calculating unit 102 performs a first-order gradient calculation on the images in the respective sub-windows in the same manner as the first calculating unit 11.
  • the fourth calculating unit 103 calculates a gradient direction histogram of each cell in the image in each sub-window in the same manner as the second calculating unit 12.
  • the third processing unit 104 normalizes all cells in each block in the image within each sub-window in the same manner as the first processing unit 13.
  • the fourth processing unit 105 normalizes all the blocks in the image within each sub-window in the same manner as the second processing unit 14. Finally, the combination unit 106 combines the gradient direction histograms of the respective sub-windows to form the final human body feature vector.
  • the identification module 20 includes a first dividing unit 21 and a gesture recognition unit 22, as shown in FIG. 9, wherein: the first dividing unit 21 is configured to divide the detected different human bodies into different regions; the gesture recognition unit 22 is configured to perform gesture recognition on each region and take the recognized gestures of the respective regions as the gesture actions of the respective human bodies.
  • the execution module 30 translates the gesture actions of the respective human bodies into corresponding operation instructions according to the corresponding relationship between the gesture actions and the operation instructions, and respectively executes the respective operation instructions. Thereby, the recognition of the multi-person gesture is realized, and the gesture operation of the multi-person can be simultaneously responded.
  • with the apparatus for implementing gesture operations, human body detection is performed; when at least two human bodies are detected, the gesture actions of each human body are recognized separately, and corresponding operation instructions are executed according to each human body's gesture action. Gestures of multiple people are thus recognized simultaneously and their gesture operations can be responded to, so that gesture operation can be applied to multi-person application scenarios, expanding the application range of gesture operation.
  • the present invention also provides an apparatus for implementing gesture operations, comprising a memory, a processor, and at least one application stored in the memory and configured to be executed by the processor, the application being configured to perform the method of implementing gesture operations, which includes the following steps: performing human body detection; when at least two human bodies are detected, separately identifying the gesture actions of each human body; and executing corresponding operation instructions according to the gesture actions of each human body.
  • the method for implementing the gesture operation described in this embodiment is the method for implementing the gesture operation involved in the foregoing embodiment of the present invention, and details are not described herein again.
  • the present invention includes apparatus that is directed to performing one or more of the operations described herein. These devices may be specially designed and manufactured for the required purposes, or may also include known devices in a general purpose computer. These devices have computer programs stored therein that are selectively activated or reconfigured.
  • such computer programs may be stored in a device-readable (e.g., computer-readable) medium or in any type of medium suitable for storing electronic instructions and coupled to a bus, including but not limited to any type of disk (including floppy disks, hard disks, CDs, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards.
  • a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
  • each block of the block diagrams and/or flow diagrams, and combinations of blocks in the block diagrams and/or flow diagrams, can be implemented by computer program instructions.
  • these computer program instructions can be provided to a general-purpose computer, a special-purpose computer, or a processor of another programmable data processing apparatus, such that the instructions are executed by the computer or the processor of the other programmable data processing apparatus.
  • the steps, measures, and solutions in the various operations, methods, and processes discussed in the present invention may be alternated, changed, combined, or deleted. Further, other steps, measures, and solutions in the various operations, methods, and processes discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted. Further, steps, measures, and solutions in the prior art that include the various operations, methods, and processes disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The present invention discloses a method and apparatus for implementing gesture operations. The method includes the following steps: performing human body detection; when at least two human bodies are detected, separately identifying the gesture actions of each human body; and executing corresponding operation instructions according to the gesture actions of each human body. Gestures of multiple people are thus recognized simultaneously and their gesture operations can be responded to, so that gesture operation can be applied to multi-person application scenarios, expanding the application range of gesture operation.

Description

实现手势操作的方法和装置 技术领域
本发明涉及电子技术领域,特别是涉及到一种实现手势操作的方法和装置。
背景技术
随着图像识别技术的发展,手势操作方式已开始应用于终端设备。手势操作的大致流程为:用户在终端设备面前做出预设的手势动作,终端设备识别出人体的手势动作,将手势动作翻译为对应的操作指令并执行。通过手势操作,用户无需靠近和触摸终端设备,远距离就能操控终端设备,给用户带来了新奇的操作体验。
然而,现有的终端设备在同一时间内只能识别一个人的手势动作,当有多个人同时做出手势动作,终端设备则无法识别多人的手势,进而不能同时响应多人的手势操作,因此无法应用于多人操作的应用场景,使得手势操作的应用范围受限。
技术问题
本发明的主要目的为提供一种实现手势操作的方法和装置,旨在解决现有的终端设备不能同时响应多人的手势操作的技术问题,扩展手势操作的应用范围。
Technical Solution
To achieve the above object, an embodiment of the present invention proposes a method for implementing gesture operations, the method comprising the following steps:
performing human-body detection;
when at least two human bodies are detected, separately recognizing the gesture action of each human body;
executing the corresponding operation instruction according to the gesture action of each human body.
Optionally, the step of separately recognizing the gesture action of each human body comprises:
dividing the detected different human bodies into different regions;
performing gesture recognition on each region separately, and taking the recognized gesture action of each region as the gesture action of the corresponding human body.
Optionally, the step of performing human-body detection comprises: performing human-body detection based on a histogram of oriented gradients.
Optionally, the step of performing human-body detection based on a histogram of oriented gradients comprises:
performing a first-order gradient computation on the image within a detection window;
computing the gradient direction histogram of each cell in the image;
normalizing all cells within each block of the image to obtain the gradient direction histogram of the block;
normalizing all blocks within the image to obtain the gradient direction histogram of the detection window, and taking the gradient direction histogram of the detection window as the human-body feature vector.
Optionally, the step of performing human-body detection based on a histogram of oriented gradients comprises:
dividing the detection window into N sub-windows, N ≥ 2;
performing a first-order gradient computation on the image within each sub-window;
computing the gradient direction histogram of each cell in the image within each sub-window;
normalizing all cells within each block of the image within each sub-window to obtain the gradient direction histogram of the block;
normalizing all blocks of the image within each sub-window to obtain the gradient direction histogram of the sub-window;
composing the gradient direction histograms of the sub-windows into the human-body feature vector.
Optionally, the step of computing the gradient direction histogram of each cell in the image comprises:
computing the gradient of each pixel in the image;
compiling the gradient direction histogram of all pixels in each cell of the image.
Optionally, the step of compiling the gradient direction histogram of all pixels in each cell of the image comprises:
for each cell, dividing [0, π] into a plurality of intervals;
performing a weighted voting computation according to the gradient direction of each pixel in the cell, to obtain the gradient direction histogram of all pixels in the cell.
Optionally, in the weighted voting computation, the weight of each pixel is the gradient magnitude of that pixel.
Optionally, the step of performing the weighted voting computation according to the gradient direction of each pixel in the cell comprises:
performing the weighted voting computation using trilinear interpolation.
Optionally, N = 4.
An embodiment of the present invention also proposes an apparatus for implementing gesture operations, the apparatus comprising:
a detection module, configured to perform human-body detection;
a recognition module, configured to separately recognize the gesture action of each human body when at least two human bodies are detected;
an execution module, configured to execute the corresponding operation instruction according to the gesture action of each human body.
Optionally, the recognition module comprises:
a first dividing unit, configured to divide the detected different human bodies into different regions;
a gesture recognition unit, configured to perform gesture recognition on each region separately and to take the recognized gesture action of each region as the gesture action of the corresponding human body.
Optionally, the detection module is configured to perform human-body detection based on a histogram of oriented gradients.
Optionally, the detection module comprises:
a first computation unit, configured to perform a first-order gradient computation on the image within a detection window;
a second computation unit, configured to compute the gradient direction histogram of each cell in the image;
a first processing unit, configured to normalize all cells within each block of the image to obtain the gradient direction histogram of the block;
a second processing unit, configured to normalize all blocks within the image to obtain the gradient direction histogram of the detection window, and to take the gradient direction histogram of the detection window as the human-body feature vector.
Optionally, the detection module comprises:
a second dividing unit, configured to divide the detection window into N sub-windows, N ≥ 2;
a third computation unit, configured to perform a first-order gradient computation on the image within each sub-window;
a fourth computation unit, configured to compute the gradient direction histogram of each cell in the image within each sub-window;
a third processing unit, configured to normalize all cells within each block of the image within each sub-window to obtain the gradient direction histogram of the block;
a fourth processing unit, configured to normalize all blocks of the image within each sub-window to obtain the gradient direction histogram of the sub-window;
a combining unit, configured to compose the gradient direction histograms of the sub-windows into the human-body feature vector.
Optionally, the second computation unit comprises:
a computation sub-unit, configured to compute the gradient of each pixel in the image;
a statistics sub-unit, configured to compile the gradient direction histogram of all pixels in each cell of the image.
Optionally, the statistics sub-unit comprises:
a dividing sub-unit, configured to divide [0, π] into a plurality of intervals for each cell;
a weighted-computation sub-unit, configured to perform a weighted voting computation according to the gradient direction of each pixel in the cell, to obtain the gradient direction histogram of all pixels in the cell.
Optionally, the weighted-computation sub-unit performs the weighted voting computation using trilinear interpolation.
An embodiment of the present invention further proposes an apparatus for implementing gesture operations, comprising a memory, a processor, and at least one application program stored in the memory and configured to be executed by the processor, the application program being configured to perform the foregoing method for implementing gesture operations.
Beneficial Effects
In the method for implementing gesture operations provided by an embodiment of the present invention, human-body detection is performed; when at least two human bodies are detected, the gesture action of each human body is recognized separately, and the corresponding operation instruction is executed according to the gesture action of each human body. The gestures of multiple people are thereby recognized simultaneously, and their gesture operations can be responded to, so that gesture operation can be applied to multi-user scenarios, extending the range of applications of gesture operation.
Brief Description of the Drawings
FIG. 1 is a flowchart of an embodiment of the method for implementing gesture operations according to the present invention;
FIG. 2 is a detailed flowchart of the human-body detection step in an embodiment of the present invention;
FIG. 3 is another detailed flowchart of the human-body detection step in an embodiment of the present invention;
FIG. 4 is a block diagram of an embodiment of the apparatus for implementing gesture operations according to the present invention;
FIG. 5 is a block diagram of the detection module in FIG. 4;
FIG. 6 is a block diagram of the second computation unit in FIG. 5;
FIG. 7 is a block diagram of the statistics sub-unit in FIG. 6;
FIG. 8 is another block diagram of the detection module in FIG. 4;
FIG. 9 is a block diagram of the recognition module in FIG. 4.
The realization of the objects, functional features, and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Best Mode for Carrying Out the Invention
It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements having identical or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it.
Those skilled in the art will understand that, unless specifically stated otherwise, the singular forms "a", "an", "the", and "said" used herein may also include the plural. It should further be understood that the word "comprising" used in the specification of the present invention refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Furthermore, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The phrase "and/or" as used herein includes all or any unit of, and all combinations of, one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art and, unless specifically defined as herein, are not to be interpreted in an idealized or overly formal sense.
Those skilled in the art will understand that "terminal" and "terminal device" as used herein include both devices having only a wireless signal receiver with no transmitting capability and devices having receiving and transmitting hardware capable of two-way communication over a two-way communication link. Such devices may include: cellular or other communication devices with or without a single-line or multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio-frequency receiver, a pager, Internet/intranet access, a web browser, a notepad, a calendar, and/or a GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio-frequency receiver. "Terminal" and "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea, and/or land), or suited and/or configured to operate locally and/or to operate in distributed form at any other location on Earth and/or in space. The "terminal" or "terminal device" used herein may also be a communication terminal, an Internet terminal, or a music/video player terminal, for example a PDA, an MID (Mobile Internet Device), and/or a mobile phone with music/video playback capability, or a device such as a smart TV or a set-top box.
The method and apparatus for implementing gesture operations according to embodiments of the present invention may be applied to various terminal devices, including fixed terminals such as game consoles, television sets, and personal computers, and mobile terminals such as mobile phones and tablets.
Referring to FIG. 1, an embodiment of the method for implementing gesture operations according to the present invention is proposed, the method comprising the following steps:
S11. Perform human-body detection.
In step S11, human-body detection may be performed based on image features such as the histogram of oriented gradients (HOG), the scale-invariant feature transform (SIFT), the local binary pattern (LBP), or Haar features.
The histogram of oriented gradients is a local descriptor similar to the scale-invariant feature transform; it constitutes a human-body feature by computing histograms of gradient orientations over local regions. Unlike SIFT, which extracts features at keypoints and is therefore a sparse description method, HOG is a dense description method.
The HOG description method has the following advantages: it represents the structural characteristics of edges (gradients) and can therefore describe local shape information; the quantization of position and orientation space suppresses, to some extent, the effects of translation and rotation; and normalization within local regions partially cancels the effects of illumination. The embodiments of the present invention therefore preferably perform human-body detection based on the histogram of oriented gradients.
As shown in FIG. 2, the specific flow of human-body detection based on the histogram of oriented gradients in an embodiment of the present invention is as follows:
S101. Perform a first-order gradient computation on the image within the detection window.
In this embodiment, a detection window of normalized size (e.g., 64 x 128) is taken as input, and the horizontal and vertical gradients of the image within the detection window are computed with the first-order (one-dimensional) Sobel operator [-1, 0, 1].
The benefit of using a single window as classifier input is that the classifier is invariant to the position and scale of the target. For an input image to be examined, the detection window must be moved along the horizontal and vertical directions, and the image must also be rescaled over multiple scales so that human bodies at different scales can be detected.
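By way of non-limiting illustration only (Python and NumPy are not part of the disclosure), step S101 with the one-dimensional mask [-1, 0, 1] could be sketched as:

```python
import numpy as np

def first_order_gradients(image):
    """Compute horizontal/vertical gradients of a grayscale image with the
    1-D mask [-1, 0, 1], plus the gradient magnitude and orientation."""
    img = image.astype(np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Central differences; border pixels are left at zero for simplicity.
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, pi), as used by the histogram step.
    orientation = np.mod(np.arctan2(gy, gx), np.pi)
    return gx, gy, magnitude, orientation
```

In practice the input would be each 64 x 128 detection window slid over the image; the border handling here is one of several reasonable choices.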
S102. Compute the gradient direction histogram of each cell in the image.
The gradient direction histograms are computed densely over grids known as cells and blocks. The image is divided into a number of cells, each consisting of several pixels, while a block is composed of several adjacent cells.
In this embodiment, the gradient of each pixel in the image is computed first, and then the gradient direction histogram of all pixels in each cell, that is, the gradient direction histogram of that cell, is compiled. When compiling the histogram of each cell, [0, π] is first divided into a plurality of intervals for that cell, and a weighted voting computation is then performed according to the gradient direction of each pixel in the cell, yielding the gradient direction histogram of all pixels in the cell.
In the weighted voting computation, the weight of each pixel is preferably the gradient magnitude of that pixel. To reduce aliasing, the weighted voting computation is preferably performed using trilinear interpolation.
Every cell in the image is traversed to obtain the gradient direction histogram of each cell in the image.
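A simplified sketch of the per-cell voting follows. It splits each magnitude-weighted vote only between the two nearest orientation bins; the full trilinear scheme named in the text additionally interpolates across neighboring cells, which is omitted here for brevity. The bin count of 9 is a conventional assumption, not prescribed by the patent.

```python
import numpy as np

def cell_histogram(orientation, magnitude, n_bins=9):
    """Gradient direction histogram for one cell over [0, pi).

    Each pixel votes with weight equal to its gradient magnitude; the vote
    is split linearly between the two orientation bins whose centers are
    nearest to the pixel's gradient direction."""
    hist = np.zeros(n_bins)
    bin_width = np.pi / n_bins
    for theta, mag in zip(orientation.ravel(), magnitude.ravel()):
        pos = theta / bin_width - 0.5          # continuous bin position
        lo = int(np.floor(pos)) % n_bins       # lower neighboring bin
        hi = (lo + 1) % n_bins                 # upper neighboring bin
        frac = pos - np.floor(pos)             # share going to `hi`
        hist[lo] += mag * (1.0 - frac)
        hist[hi] += mag * frac
    return hist
```

A gradient direction exactly at a bin center contributes entirely to that bin; a direction on a bin boundary splits its weight evenly between the two adjacent bins.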
S103. Normalize all cells within each block of the image to obtain the gradient direction histogram of each block.
Within each block, the gradient direction histograms of the cells in that block are normalized so as to eliminate the effect of illumination, yielding the gradient direction histogram of the block.
Every block in the image is traversed to obtain the gradient direction histogram of each block in the image.
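The block-level normalization can be sketched as below. The patent states only that the cell histograms within a block are normalized against illumination; the L2 norm used here is a common choice, not something the text prescribes.

```python
import numpy as np

def block_descriptor(cell_hists, eps=1e-6):
    """Concatenate the cell histograms of one block and L2-normalize,
    so the block descriptor is invariant to uniform illumination gain."""
    v = np.concatenate([np.asarray(h, dtype=np.float64).ravel()
                        for h in cell_hists])
    return v / np.sqrt(np.sum(v * v) + eps)
```

The block descriptors of the whole window, concatenated, then form the human-body feature vector of step S104.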
S104. Normalize all blocks within the image to obtain the gradient direction histogram of the detection window, and take the gradient direction histogram of the detection window as the human-body feature vector.
In step S104, the gradient direction histogram of the detection window, obtained by normalizing the blocks, constitutes the human-body feature vector, thereby accomplishing human-body detection.
Because the gradient direction histogram is computed densely, the amount of computation is large. To reduce the computation and increase detection speed, one may choose to compute the gradient direction histogram only over key regions containing relatively distinct human-body contours, thereby reducing the dimensionality. Human-body detection may accordingly be performed with the method shown in FIG. 3, which comprises the following steps:
S201. Divide the detection window into N sub-windows, N ≥ 2.
S202. Perform a first-order gradient computation on the image within each sub-window.
S203. Compute the gradient direction histogram of each cell in the image within each sub-window.
S204. Normalize all cells within each block of the image within each sub-window to obtain the gradient direction histogram of the block.
S205. Normalize all blocks of the image within each sub-window to obtain the gradient direction histogram of the sub-window.
S206. Compose the gradient direction histograms of the sub-windows into the human-body feature vector.
This scheme differs from the first scheme (shown in FIG. 2) in the addition of steps S201 and S206. In step S201 the detection window is divided into N (N ≥ 2) sub-windows; for example, the four key regions of the human body in the detection window (the head region, the left-arm region, the right-arm region, and the leg region) are taken as sub-windows, that is, the detection window is divided into 4 sub-windows. Then, in steps S202-S205, the gradient direction histogram of each sub-window is computed in the same manner as in the first scheme. Finally, in step S206, the gradient direction histograms of the sub-windows are composed into the final human-body feature vector.
Practical computation shows that the dimensionality of the human-body feature vector obtained with the key-region method is markedly reduced, which effectively lowers the computation at every stage and increases detection speed.
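Steps S201 and S206 of the sub-window scheme can be sketched as follows. The region coordinates and the `describe` callback are illustrative assumptions; the patent fixes only that the window is split into N ≥ 2 sub-windows (N = 4 preferred), not where the splits fall.

```python
import numpy as np

def subwindow_feature(window, regions, describe):
    """Compute a descriptor per sub-window and concatenate them (S206).

    `regions` lists (top, bottom, left, right) crops of the detection
    window; `describe` maps an image patch to a 1-D feature vector
    (e.g. the HOG descriptor of steps S202-S205)."""
    parts = [describe(window[t:b, l:r]) for (t, b, l, r) in regions]
    return np.concatenate(parts)

# Hypothetical split of a 64 x 128 window (rows x cols = 128 x 64) into
# the four key regions named in the text; the coordinates are made up.
DEFAULT_REGIONS = [
    (0, 32, 16, 48),     # head region
    (32, 96, 0, 32),     # left-arm region
    (32, 96, 32, 64),    # right-arm region
    (96, 128, 8, 56),    # leg region
]
```

Since each sub-window covers only part of the window, the concatenated vector is shorter than a dense whole-window descriptor, which is the dimensionality reduction the text describes.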
S12. When at least two human bodies are detected, recognize the gesture action of each human body separately.
In this embodiment, when at least two human bodies are detected, the detected different human bodies are first divided into different regions; gesture recognition is then performed on each region separately, and the recognized gesture action of each region is taken as the gesture action of the corresponding human body.
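Step S12 can be sketched as below, assuming the human-body detector yields one bounding box per person and that some single-person recognizer `recognize_gesture` already exists (both are assumptions; the patent does not fix these interfaces):

```python
import numpy as np

def gestures_per_person(frame, person_boxes, recognize_gesture):
    """Crop one region per detected person and run gesture recognition on
    each region independently, so several people are handled per frame."""
    return [recognize_gesture(frame[t:b, l:r])
            for (t, b, l, r) in person_boxes]

# Demo with a stand-in recognizer that just reports the crop size.
frame = np.zeros((120, 160))
boxes = [(10, 110, 5, 55), (10, 110, 90, 140)]   # two detected bodies
sizes = gestures_per_person(frame, boxes, lambda crop: crop.shape)
```

Because each region is processed independently, an existing single-person recognizer can be reused unchanged for the multi-person case.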
S13. Execute the corresponding operation instruction according to the gesture action of each human body.
In this embodiment, according to the correspondence between gesture actions and operation instructions, the gesture action of each human body is translated into the corresponding operation instruction, and each operation instruction is executed separately. The gestures of multiple people are thus recognized, and their gesture operations can be responded to simultaneously.
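The correspondence table of step S13 can be sketched as a plain lookup. The gesture names and instructions below are hypothetical; the patent states only that such a correspondence exists, not its contents.

```python
# Hypothetical gesture-to-instruction table.
GESTURE_COMMANDS = {
    "palm_open": "pause",
    "swipe_left": "previous",
    "swipe_right": "next",
}

def execute_all(gestures, dispatch):
    """Translate each person's gesture into an operation instruction and
    execute it via `dispatch`; unrecognized gestures are skipped."""
    executed = []
    for gesture in gestures:
        command = GESTURE_COMMANDS.get(gesture)
        if command is not None:
            dispatch(command)
            executed.append(command)
    return executed
```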
In the method for implementing gesture operations according to this embodiment of the present invention, human-body detection is performed; when at least two human bodies are detected, the gesture action of each human body is recognized separately, and the corresponding operation instruction is executed according to the gesture action of each human body. The gestures of multiple people are thereby recognized simultaneously, and their gesture operations can be responded to, so that gesture operation can be applied to multi-user scenarios, extending the range of applications of gesture operation.
Referring to FIG. 4, an embodiment of the apparatus for implementing gesture operations according to the present invention is proposed. The apparatus comprises a detection module 10, a recognition module 20, and an execution module 30, wherein: the detection module 10 is configured to perform human-body detection; the recognition module 20 is configured to separately recognize the gesture action of each human body when at least two human bodies are detected; and the execution module 30 is configured to execute the corresponding operation instruction according to the gesture action of each human body.
In this embodiment, the detection module 10 may perform human-body detection based on image features such as the histogram of oriented gradients (HOG), the scale-invariant feature transform (SIFT), the local binary pattern (LBP), or Haar features.
The histogram of oriented gradients is a local descriptor similar to the scale-invariant feature transform; it constitutes a human-body feature by computing histograms of gradient orientations over local regions. Unlike SIFT, which extracts features at keypoints and is therefore a sparse description method, HOG is a dense description method.
The HOG description method has the following advantages: it represents the structural characteristics of edges (gradients) and can therefore describe local shape information; the quantization of position and orientation space suppresses, to some extent, the effects of translation and rotation; and normalization within local regions partially cancels the effects of illumination. The embodiments of the present invention therefore preferably perform human-body detection based on the histogram of oriented gradients.
As shown in FIG. 5, the detection module 10 comprises a first computation unit 11, a second computation unit 12, a first processing unit 13, and a second processing unit 14, wherein: the first computation unit 11 is configured to perform a first-order gradient computation on the image within a detection window; the second computation unit 12 is configured to compute the gradient direction histogram of each cell in the image; the first processing unit 13 is configured to normalize all cells within each block of the image to obtain the gradient direction histogram of the block; and the second processing unit 14 is configured to normalize all blocks within the image to obtain the gradient direction histogram of the detection window and to take it as the human-body feature vector.
In this embodiment, the first computation unit 11 takes a detection window of normalized size (e.g., 64 x 128) as input and computes the horizontal and vertical gradients of the image within the detection window with the first-order (one-dimensional) Sobel operator [-1, 0, 1].
The benefit of using a single window as classifier input is that the classifier is invariant to the position and scale of the target. For an input image to be examined, the detection window must be moved along the horizontal and vertical directions, and the image must also be rescaled over multiple scales so that human bodies at different scales can be detected.
The gradient direction histograms are computed densely over grids known as cells and blocks. The image is divided into a number of cells, each consisting of several pixels, while a block is composed of several adjacent cells.
In this embodiment, as shown in FIG. 6, the second computation unit 12 comprises a computation sub-unit 121 and a statistics sub-unit 122, wherein: the computation sub-unit 121 is configured to compute the gradient of each pixel in the image; and the statistics sub-unit 122 is configured to compile the gradient direction histogram of all pixels in each cell of the image, that is, the gradient direction histogram of that cell.
As shown in FIG. 7, the statistics sub-unit 122 comprises a dividing sub-unit 1221 and a weighted-computation sub-unit 1222, wherein: the dividing sub-unit 1221 is configured to divide [0, π] into a plurality of intervals for each cell; and the weighted-computation sub-unit 1222 is configured to perform a weighted voting computation according to the gradient direction of each pixel in the cell, to obtain the gradient direction histogram of all pixels in the cell.
In the weighted voting computation performed by the weighted-computation sub-unit 1222, the weight of each pixel is preferably the gradient magnitude of that pixel. To reduce aliasing, the weighted-computation sub-unit 1222 preferably performs the weighted voting computation using trilinear interpolation.
The weighted-computation sub-unit 1222 traverses every cell in the image to obtain the gradient direction histogram of each cell in the image.
Within each block, the first processing unit 13 normalizes the gradient direction histograms of the cells in that block so as to eliminate the effect of illumination, yielding the gradient direction histogram of the block. The first processing unit 13 traverses every block in the image to obtain the gradient direction histogram of each block in the image.
The second processing unit 14 forms the human-body feature vector from the gradient direction histogram of the detection window obtained by normalizing the blocks, thereby accomplishing human-body detection.
Because the gradient direction histogram is computed densely, the amount of computation is large. To reduce the computation and increase detection speed, one may choose to compute the gradient direction histogram only over key regions containing relatively distinct human-body contours, thereby reducing the dimensionality.
Accordingly, as shown in FIG. 8, the detection module 10 may also comprise a second dividing unit 101, a third computation unit 102, a fourth computation unit 103, a third processing unit 104, a fourth processing unit 105, and a combining unit 106, wherein: the second dividing unit 101 is configured to divide the detection window into N sub-windows, N ≥ 2; the third computation unit 102 is configured to perform a first-order gradient computation on the image within each sub-window; the fourth computation unit 103 is configured to compute the gradient direction histogram of each cell in the image within each sub-window; the third processing unit 104 is configured to normalize all cells within each block of the image within each sub-window to obtain the gradient direction histogram of the block; the fourth processing unit 105 is configured to normalize all blocks of the image within each sub-window to obtain the gradient direction histogram of the sub-window; and the combining unit 106 is configured to compose the gradient direction histograms of the sub-windows into the human-body feature vector.
For example, the second dividing unit 101 divides the detection window into 4 sub-windows, taking as sub-windows the four key regions of the human body in the detection window: the head region, the left-arm region, the right-arm region, and the leg region. The third computation unit 102 performs the first-order gradient computation on the image within each sub-window in the same manner as the first computation unit 11. The fourth computation unit 103 computes the gradient direction histogram of each cell in the image within each sub-window in the same manner as the second computation unit 12. The third processing unit 104 normalizes all cells within each block of the image within each sub-window in the same manner as the first processing unit 13. The fourth processing unit 105 normalizes all blocks of the image within each sub-window in the same manner as the second processing unit 14. Finally, the combining unit 106 composes the gradient direction histograms of the sub-windows into the final human-body feature vector.
Practical computation shows that the dimensionality of the human-body feature vector obtained with the key-region method is markedly reduced, which effectively lowers the computation at every stage and increases detection speed.
In this embodiment, as shown in FIG. 9, the recognition module 20 comprises a first dividing unit 21 and a gesture recognition unit 22, wherein: the first dividing unit 21 is configured to divide the detected different human bodies into different regions; and the gesture recognition unit 22 is configured to perform gesture recognition on each region separately and to take the recognized gesture action of each region as the gesture action of the corresponding human body.
In this embodiment, the execution module 30 translates, according to the correspondence between gesture actions and operation instructions, the gesture action of each human body into the corresponding operation instruction and executes each operation instruction separately. The gestures of multiple people are thus recognized, and their gesture operations can be responded to simultaneously.
In the apparatus for implementing gesture operations according to this embodiment of the present invention, human-body detection is performed; when at least two human bodies are detected, the gesture action of each human body is recognized separately, and the corresponding operation instruction is executed according to the gesture action of each human body. The gestures of multiple people are thereby recognized simultaneously, and their gesture operations can be responded to, so that gesture operation can be applied to multi-user scenarios, extending the range of applications of gesture operation.
The present invention also proposes an apparatus for implementing gesture operations, comprising a memory, a processor, and at least one application program stored in the memory and configured to be executed by the processor, the application program being configured to perform the method for implementing gesture operations. The method comprises the following steps: performing human-body detection; when at least two human bodies are detected, separately recognizing the gesture action of each human body; and executing the corresponding operation instruction according to the gesture action of each human body. The method for implementing gesture operations described in this embodiment is the method described in the foregoing embodiments of the present invention and is not repeated here.
Those skilled in the art will understand that the present invention covers apparatus for performing one or more of the operations described in this application. Such apparatus may be specially designed and manufactured for the required purposes, or may comprise known devices in general-purpose computers. Such apparatus have computer programs stored therein that are selectively activated or reconfigured. Such a computer program may be stored in a device-readable (e.g., computer-readable) medium, or in any type of medium suitable for storing electronic instructions and respectively coupled to a bus; the computer-readable medium includes, but is not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, a readable medium includes any medium in which information is stored or transmitted by a device (e.g., a computer) in a readable form.
Those skilled in the art will understand that computer program instructions may be used to realize each block in these structure diagrams and/or block diagrams and/or flow diagrams, as well as combinations of the blocks therein. Those skilled in the art will understand that these computer program instructions may be supplied to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing means for execution, so that the processor of the computer or of the other programmable data processing means carries out the schemes specified in the block or blocks of the structure diagrams and/or block diagrams and/or flow diagrams disclosed in the present invention.
Those skilled in the art will understand that the steps, measures, and schemes in the various operations, methods, and flows that have been discussed in the present invention may be alternated, changed, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and flows that have been discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art corresponding to the various operations, methods, and flows disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted.
The foregoing description is only of preferred embodiments of the present invention and does not thereby limit the patent scope of the present invention. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (20)

  1. A method for implementing gesture operations, comprising the following steps:
    performing human-body detection;
    when at least two human bodies are detected, separately recognizing the gesture action of each human body;
    executing the corresponding operation instruction according to the gesture action of each human body.
  2. The method for implementing gesture operations according to claim 1, wherein the step of separately recognizing the gesture action of each human body comprises:
    dividing the detected different human bodies into different regions;
    performing gesture recognition on each region separately, and taking the recognized gesture action of each region as the gesture action of the corresponding human body.
  3. The method for implementing gesture operations according to claim 1, wherein the step of performing human-body detection comprises: performing human-body detection based on a histogram of oriented gradients.
  4. The method for implementing gesture operations according to claim 3, wherein the step of performing human-body detection based on a histogram of oriented gradients comprises:
    performing a first-order gradient computation on the image within a detection window;
    computing the gradient direction histogram of each cell in the image;
    normalizing all cells within each block of the image to obtain the gradient direction histogram of the block;
    normalizing all blocks within the image to obtain the gradient direction histogram of the detection window, and taking the gradient direction histogram of the detection window as the human-body feature vector.
  5. The method for implementing gesture operations according to claim 3, wherein the step of performing human-body detection based on a histogram of oriented gradients comprises:
    dividing the detection window into N sub-windows, N ≥ 2;
    performing a first-order gradient computation on the image within each sub-window;
    computing the gradient direction histogram of each cell in the image within each sub-window;
    normalizing all cells within each block of the image within each sub-window to obtain the gradient direction histogram of the block;
    normalizing all blocks of the image within each sub-window to obtain the gradient direction histogram of the sub-window;
    composing the gradient direction histograms of the sub-windows into the human-body feature vector.
  6. The method for implementing gesture operations according to claim 4, wherein the step of computing the gradient direction histogram of each cell in the image comprises:
    computing the gradient of each pixel in the image;
    compiling the gradient direction histogram of all pixels in each cell of the image.
  7. The method for implementing gesture operations according to claim 6, wherein the step of compiling the gradient direction histogram of all pixels in each cell of the image comprises:
    for each cell, dividing [0, π] into a plurality of intervals;
    performing a weighted voting computation according to the gradient direction of each pixel in the cell, to obtain the gradient direction histogram of all pixels in the cell.
  8. The method for implementing gesture operations according to claim 7, wherein, in the weighted voting computation, the weight of each pixel is the gradient magnitude of that pixel.
  9. The method for implementing gesture operations according to claim 7, wherein the step of performing the weighted voting computation according to the gradient direction of each pixel in the cell comprises:
    performing the weighted voting computation using trilinear interpolation.
  10. The method for implementing gesture operations according to claim 5, wherein N = 4.
  11. An apparatus for implementing gesture operations, comprising:
    a detection module, configured to perform human-body detection;
    a recognition module, configured to separately recognize the gesture action of each human body when at least two human bodies are detected;
    an execution module, configured to execute the corresponding operation instruction according to the gesture action of each human body.
  12. The apparatus for implementing gesture operations according to claim 11, wherein the recognition module comprises:
    a first dividing unit, configured to divide the detected different human bodies into different regions;
    a gesture recognition unit, configured to perform gesture recognition on each region separately and to take the recognized gesture action of each region as the gesture action of the corresponding human body.
  13. The apparatus for implementing gesture operations according to claim 11, wherein the detection module is configured to perform human-body detection based on a histogram of oriented gradients.
  14. The apparatus for implementing gesture operations according to claim 13, wherein the detection module comprises:
    a first computation unit, configured to perform a first-order gradient computation on the image within a detection window;
    a second computation unit, configured to compute the gradient direction histogram of each cell in the image;
    a first processing unit, configured to normalize all cells within each block of the image to obtain the gradient direction histogram of the block;
    a second processing unit, configured to normalize all blocks within the image to obtain the gradient direction histogram of the detection window and to take the gradient direction histogram of the detection window as the human-body feature vector.
  15. The apparatus for implementing gesture operations according to claim 13, wherein the detection module comprises:
    a second dividing unit, configured to divide the detection window into N sub-windows, N ≥ 2;
    a third computation unit, configured to perform a first-order gradient computation on the image within each sub-window;
    a fourth computation unit, configured to compute the gradient direction histogram of each cell in the image within each sub-window;
    a third processing unit, configured to normalize all cells within each block of the image within each sub-window to obtain the gradient direction histogram of the block;
    a fourth processing unit, configured to normalize all blocks of the image within each sub-window to obtain the gradient direction histogram of the sub-window;
    a combining unit, configured to compose the gradient direction histograms of the sub-windows into the human-body feature vector.
  16. The apparatus for implementing gesture operations according to claim 14, wherein the second computation unit comprises:
    a computation sub-unit, configured to compute the gradient of each pixel in the image;
    a statistics sub-unit, configured to compile the gradient direction histogram of all pixels in each cell of the image.
  17. The apparatus for implementing gesture operations according to claim 16, wherein the statistics sub-unit comprises:
    a dividing sub-unit, configured to divide [0, π] into a plurality of intervals for each cell;
    a weighted-computation sub-unit, configured to perform a weighted voting computation according to the gradient direction of each pixel in the cell, to obtain the gradient direction histogram of all pixels in the cell.
  18. The apparatus for implementing gesture operations according to claim 17, wherein, in the weighted voting computation, the weight of each pixel is the gradient magnitude of that pixel.
  19. The apparatus for implementing gesture operations according to claim 17, wherein the weighted-computation sub-unit performs the weighted voting computation using trilinear interpolation.
  20. An apparatus for implementing gesture operations, comprising a memory, a processor, and at least one application program stored in the memory and configured to be executed by the processor, wherein the application program is configured to perform the method for implementing gesture operations according to claim 1.
PCT/CN2018/077454 2018-02-09 2018-02-27 Method and apparatus for implementing gesture operations WO2019153379A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810135317.4 2018-02-09
CN201810135317.4A CN108304817B (zh) 2018-02-09 2018-02-09 Method and apparatus for implementing gesture operations

Publications (1)

Publication Number Publication Date
WO2019153379A1 true WO2019153379A1 (zh) 2019-08-15

Family

ID=62864938

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/077454 WO2019153379A1 (zh) 2018-02-09 2018-02-27 Method and apparatus for implementing gesture operations

Country Status (2)

Country Link
CN (1) CN108304817B (zh)
WO (1) WO2019153379A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111142664B * 2019-12-27 2023-09-01 恒信东方文化股份有限公司 Multi-person real-time hand tracking system and tracking method
CN112328090B * 2020-11-27 2023-01-31 北京市商汤科技开发有限公司 Gesture recognition method and apparatus, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101874234A (zh) * 2008-09-29 2010-10-27 松下电器产业株式会社 User interface device, user interface method, and recording medium
CN104616028A (zh) * 2014-10-14 2015-05-13 北京中科盘古科技发展有限公司 Human limb posture and action recognition method based on spatial-partition learning
CN105912982A (zh) * 2016-04-01 2016-08-31 北京明泰朗繁精密设备有限公司 Control method and device based on limb action recognition

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102789568B (zh) * 2012-07-13 2015-03-25 浙江捷尚视觉科技股份有限公司 Gesture recognition method based on depth information
CN104268528B (zh) * 2014-09-28 2017-10-17 中智科创机器人有限公司 Crowd gathering area detection method and apparatus
CN104751513A (zh) * 2015-03-12 2015-07-01 深圳市同洲电子股份有限公司 Method and apparatus for establishing a human skeleton model


Also Published As

Publication number Publication date
CN108304817A (zh) 2018-07-20
CN108304817B (zh) 2019-10-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18905064

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18905064

Country of ref document: EP

Kind code of ref document: A1