WO2017092431A1 - Skin-color-based human hand detection method and device - Google Patents


Publication number
WO2017092431A1
Authority
WO
WIPO (PCT)
Prior art keywords
skin
pixel
hsv
image
binary image
Prior art date
Application number
PCT/CN2016/096982
Other languages
English (en)
Chinese (zh)
Inventor
李艳杰
Original Assignee
乐视控股(北京)有限公司
乐视致新电子科技(天津)有限公司
Priority date
Filing date
Publication date
Application filed by 乐视控股(北京)有限公司 and 乐视致新电子科技(天津)有限公司
Publication of WO2017092431A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/107 Static hand or arm
    • G06V 40/11 Hand-related biometrics; Hand pose recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/30 Noise filtering

Definitions

  • the present application relates to the field of computer vision, and in particular, to a human hand detection method and apparatus based on skin color.
  • Gesture recognition is receiving increasing attention.
  • In a gesture-based human-computer interaction system, it is necessary to first acquire the position of the hand in the image.
  • The most common method currently used is to obtain gesture information by detecting skin color.
  • The most common segmentation method at present is segmentation based on skin color.
  • The statistics-based skin color detection method mainly uses a statistical skin color model, which involves two steps: color space transformation and skin color modeling. The physics-based method introduces the interaction between light and skin into skin color detection, detecting skin color through the study of skin reflectance models and spectral characteristics.
  • In the prior art, the recognition efficiency of the human hand shape is low, the false detection rate is high, and the result is very susceptible to illumination, so the accuracy of gesture recognition is limited.
  • The embodiments of the present invention provide a skin-color-based human hand detection method and device, addressing the defects of prior-art statistics-based skin color detection and human hand recognition: low efficiency, a high false detection rate, and high susceptibility to illumination.
  • With the provided method, recognition of the human hand based on skin color detection is efficient and accurate, which further improves the accuracy of gesture recognition.
  • the embodiment of the present application provides a human hand detection method based on skin color, including:
  • the pre-trained K-nearest neighbor classifier is used to determine whether the largest connected area is a hand shape, thereby realizing human hand recognition.
  • the embodiment of the present application provides a human hand detecting device based on skin color, including:
  • An image conversion module configured to convert the acquired image to be detected from an RGB color space to an HSV color space to acquire an HSV image, and convert the image to be detected from an RGB color space to an r-g color space to obtain an r-g image;
  • a binary map obtaining module configured to traverse and read each pixel in the HSV image, converting the HSV image into a first binary image according to a pre-established HSV histogram model, and to traverse and read each pixel in the rg image, converting the rg image into a second binary image according to a pre-established mixed Gaussian model;
  • bitwise operation module configured to perform a bitwise AND operation on the first binary image and the second binary image to obtain a comprehensive binary image
  • a filtering module configured to filter the integrated binary image to obtain an optimized binary image
  • a connected area judging module configured to analyze a largest connected area in the optimized binary image, and use the largest connected area as a skin area
  • the human hand identification module is configured to determine whether the maximum connected area is a hand shape using a pre-trained K-nearest neighbor classifier, thereby realizing human hand recognition.
  • An embodiment of the present application provides an electronic device that performs the skin-color-based human hand detection method according to any of the foregoing embodiments.
  • An embodiment of the present application provides a non-transitory computer readable storage medium storing computer instructions which, when executed, implement part or all of the steps of any implementation of the skin-color-based human hand detection method provided by the embodiments of the present application.
  • An embodiment of the present application provides an electronic device, including one or more processors and a memory, wherein the memory stores instructions executable by the one or more processors, the instructions being configured to perform the skin-color-based human hand detection method according to any of the above embodiments of the present application.
  • An embodiment of the present application provides a computer program product, the computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the skin-color-based human hand detection method according to any of the above embodiments of the present application.
  • The skin color detection method and device provided by the embodiments of the present application achieve high-accuracy detection of the skin region by jointly applying the HSV histogram, the mixed Gaussian model, filtering-based denoising, and connected-domain extraction, while the K-nearest neighbor classifier enables fast and accurate extraction of the human hand.
  • FIG. 1 is a technical flowchart of Embodiment 1 of the present application.
  • FIG. 2 is a technical flowchart of Embodiment 2 of the present application.
  • FIG. 3 is a technical flowchart of Embodiment 3 of the present application.
  • FIG. 4 is a schematic structural diagram of a device according to Embodiment 4 of the present application.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • a human hand detection method based on skin color includes the following steps:
  • Step 110 Convert the acquired image to be detected from the RGB color space to the HSV color space to obtain an HSV image, and convert the image to be detected from the RGB color space to the r-g color space to obtain an r-g image;
  • In order to make the logical description clearer, this step is divided into two steps: step 111 and step 112. It should be noted that there is no fixed order between step 111 and step 112; the order used in the following description does not constitute a limitation.
  • Step 111 Convert the acquired image to be detected from the RGB color space to the HSV color space to obtain an HSV image.
  • The RGB color space produces colors by varying the three color channels of red (R), green (G), and blue (B) and superimposing them on each other.
  • RGB stands for red, green, and blue.
  • The HSV (Hue, Saturation, Value) color space is a color space created based on the intuitive characteristics of color. H, S, and V represent hue, saturation, and brightness (value), respectively. Converting the image to be detected from the RGB color space to the HSV color space overcomes, to some extent, the influence of illumination changes on skin color detection.
  • Both RGB and CMY color models are hardware oriented
  • the HSV (Hue Saturation Value) color model is user-oriented.
  • The three-dimensional representation of the HSV model evolved from the RGB cube: looking from the white vertex of the RGB cube along the main diagonal toward the black vertex, the cube appears as a hexagon.
  • The hexagonal boundary represents hue, the horizontal axis represents saturation (purity), and brightness is measured along the vertical axis.
  • The image to be detected is converted from the RGB color space to the HSV color space by using the following formulas:
  • V = max(R, G, B)
  • S = (V - min(R, G, B)) / V if V is not 0, otherwise S = 0
  • H = 60 * (G - B) / (V - min(R, G, B)) if V = R;
  • H = 120 + 60 * (B - R) / (V - min(R, G, B)) if V = G;
  • H = 240 + 60 * (R - G) / (V - min(R, G, B)) if V = B (H is increased by 360 if negative);
  • where R is the red value of the pixel, G is the green value of the pixel, B is the blue value of the pixel, max() indicates the maximum value operation, min() indicates the minimum value operation, V is the maximum value among R, G, and B, and H, S, and V are the color values of the pixel after conversion.
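As a sketch of the conversion above (a direct per-pixel transcription of the standard RGB-to-HSV formulas, with H in degrees and S, V normalized to [0, 1]; `rgb_to_hsv_pixel` is an illustrative name, not from the patent):

```python
def rgb_to_hsv_pixel(R, G, B):
    """Convert one RGB pixel (each channel 0-255) to HSV.

    H is returned in degrees [0, 360), S and V in [0, 1]."""
    R, G, B = R / 255.0, G / 255.0, B / 255.0
    V = max(R, G, B)            # V = max(R, G, B)
    m = min(R, G, B)
    S = 0.0 if V == 0 else (V - m) / V
    if V == m:                  # gray pixel: hue is undefined, use 0
        H = 0.0
    elif V == R:
        H = (60 * (G - B) / (V - m)) % 360
    elif V == G:
        H = 120 + 60 * (B - R) / (V - m)
    else:                       # V == B
        H = 240 + 60 * (R - G) / (V - m)
    return H, S, V

print(rgb_to_hsv_pixel(255, 0, 0))  # (0.0, 1.0, 1.0): pure red
```

In practice OpenCV's `cv2.cvtColor(img, cv2.COLOR_BGR2HSV)` applies the same formulas to a whole image at once.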
  • Step 112 Convert the image to be detected from an RGB color space to an r-g color space to obtain an r-g image.
  • The RGB image is converted from the RGB color space to the r-g color space by using the following formulas:
  • r = R / (R + G + B)
  • g = G / (R + G + B)
  • b = B / (R + G + B)
  • where R is the red value of the pixel, G is the green value of the pixel, B is the blue value of the pixel, and r, g, b are the color values of the pixel after conversion.
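The normalization can be sketched in a few lines (`rgb_to_rg` is an illustrative name; the black pixel, whose channel sum is zero, is mapped to zeros here by convention):

```python
def rgb_to_rg(R, G, B):
    """Normalize an RGB pixel to the r-g chromaticity space:
    r = R/(R+G+B), g = G/(R+G+B), b = B/(R+G+B)."""
    s = R + G + B
    if s == 0:
        return 0.0, 0.0, 0.0    # black pixel: chromaticity undefined
    return R / s, G / s, B / s

# Scaling all channels by the same illumination factor leaves r-g unchanged
print(rgb_to_rg(30, 60, 90))    # (1/6, 1/3, 1/2)
print(rgb_to_rg(60, 120, 180))  # the same chromaticity values
```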
  • The RGB color space here produces a variety of colors by varying the three color channels of red (R), green (G), and blue (B) and superimposing them.
  • Each of the R, G, and B channels has 256 brightness levels, expressed as the numbers 0, 1, 2, ... up to 255.
  • An RGB color value specifies the relative brightness of the three primary colors red, green, and blue, producing a specific color for display; that is, any color can be recorded and expressed by a set of RGB values.
  • For example, if the RGB value corresponding to a pixel is (149, 123, 98), the color of this pixel is a superposition of different brightnesses of the three RGB colors.
  • The RGB value corresponding to each pixel in the picture can be directly obtained by using OpenCV; in an OpenCV image, channels 0, 1, and 2 correspond to the brightness values of the three colors blue, green, and red, respectively.
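A minimal sketch of reading a pixel's RGB value this way (a synthetic one-pixel array stands in for an image loaded with `cv2.imread`, so the snippet needs only NumPy):

```python
import numpy as np

# OpenCV's cv2.imread returns an image as a NumPy array in B, G, R channel
# order, so channels 0, 1 and 2 hold the blue, green and red brightness
# values. A 1x1 synthetic array stands in for a loaded image here.
img = np.zeros((1, 1, 3), dtype=np.uint8)
img[0, 0] = (98, 123, 149)          # B = 98, G = 123, R = 149

b, g, r = img[0, 0]
print((int(r), int(g), int(b)))     # (149, 123, 98): the pixel's RGB value
```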
  • converting the color space from RGB to r-g is actually a normalization process for RGB colors.
  • In this normalization process, when a pixel is affected by light or shadow and the values of the R, G, and B color channels change, the numerator and denominator of the normalization formula change simultaneously, so the normalized value barely changes. This transformation removes illumination information from the image, thus reducing the effects of lighting.
  • For example, the pixel value of pixel A at time T1 before normalization is RGB (30, 60, 90); at time T2, the values of the three RGB color channels change due to the influence of illumination, and the pixel value of pixel A becomes RGB (60, 120, 180).
  • After normalization, the pixel value of pixel A at time T1 is rgb (1/6, 1/3, 1/2),
  • and the pixel value of pixel A at time T2 is likewise rgb (1/6, 1/3, 1/2). It can be seen that the normalized values at times T1 and T2 do not change.
  • Step 120: Traverse and read each pixel in the HSV image, converting the HSV image into a first binary image according to a pre-established HSV histogram model, and traverse and read each pixel in the rg image, converting the rg image into a second binary image according to a pre-established mixed Gaussian model;
  • In this embodiment, step 120 is split into five steps: step 121 to step 125.
  • There is no fixed sequence among steps 122 to 125 in the actual implementation, and the embodiment of the present application is not limited in this respect.
  • Step 121: Read the HSV value of each pixel, and calculate matching probability values between the HSV value and the HSV histogram model of skin pixels and the HSV histogram model of non-skin pixels, respectively; determine from the matching values whether the pixel belongs to a skin area.
  • If the pixel belongs to the skin region, the pixel is assigned the value x; if the pixel does not belong to the skin region, the pixel is assigned the value y, thereby obtaining the first binary image.
  • x generally takes the value 255,
  • and y generally takes the value 0.
  • The pre-trained HSV histogram model stores the histogram distribution of HSV values of skin pixels and non-skin pixels. In the embodiment of the present application, this distribution is used as a reference for determining whether a new pixel is a skin pixel.
  • Specifically, the HSV value of each pixel in the image to be detected is read, matching probability values between the HSV value and the HSV histogram model of skin pixels and the HSV histogram model of non-skin pixels are calculated, and whether the pixel belongs to a skin area is determined according to the matching values.
  • the detection result has a certain stability to the change of the illumination.
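Step 121 can be sketched as follows; `skin_hist` and `nonskin_hist` stand for the pre-trained histogram models (hypothetical 64 x 32 x 16 arrays of normalized bin frequencies, matching the compressed gray levels of Embodiment 2), and a pixel is called skin when the skin model gives it the higher matching value:

```python
import numpy as np

def binarize_hsv(hsv_img, skin_hist, nonskin_hist, x=255, y=0):
    """Convert an HSV image into a binary image using two histogram models.

    hsv_img is (h, w, 3) with each channel already quantized to the bin
    indices of the histograms; a pixel gets value x (skin) when the skin
    histogram assigns it a higher probability than the non-skin one."""
    h, s, v = hsv_img[..., 0], hsv_img[..., 1], hsv_img[..., 2]
    p_skin = skin_hist[h, s, v]          # matching value per pixel
    p_nonskin = nonskin_hist[h, s, v]
    return np.where(p_skin > p_nonskin, x, y).astype(np.uint8)

# Toy models: a single bin marked as strongly "skin"
skin_hist = np.zeros((64, 32, 16))
nonskin_hist = np.zeros((64, 32, 16))
skin_hist[10, 5, 2] = 0.9
nonskin_hist[10, 5, 2] = 0.1

hsv = np.array([[[10, 5, 2], [0, 0, 0]]])          # one skin pixel, one not
print(binarize_hsv(hsv, skin_hist, nonskin_hist))  # [[255   0]]
```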
  • Step 122: Calculate a first probability density of the pixel under the skin mixed Gaussian model and a second probability density of the pixel under the non-skin mixed Gaussian model.
  • The mixed Gaussian model (GMM), also known as MOG, is an extension of the single Gaussian model; it uses K (typically 3 to 10) Gaussian components to characterize the pixels of the image.
  • A single Gaussian model is expressed by the following formula:
  • p(x) = (2 * pi)^(-d/2) * |S|^(-1/2) * exp(-(1/2) * (x - a)^T * S^(-1) * (x - a))
  • where x belongs to the d-dimensional Euclidean space,
  • a is the mean vector of the single Gaussian model,
  • S is the covariance matrix of the single Gaussian model,
  • T represents the transpose of a matrix,
  • and ()^(-1) represents the inverse of a matrix.
  • The mixed Gaussian model is formed by adding m single Gaussian models according to their weights, and is expressed by the following formula:
  • p(x) = sum over k = 1..m of pi_k * p_k(x)
  • where pi_k is the weight of the kth single Gaussian model,
  • m is the number of preset single Gaussian models,
  • p_k(x) is the probability density of the kth single Gaussian model,
  • x belongs to the d-dimensional Euclidean space,
  • a_k is the mean vector of the kth single Gaussian model,
  • and S_k is the covariance matrix of the kth single Gaussian model.
  • In the embodiment of the present application, a mixed Gaussian model is established for skin pixels and for non-skin pixels; the formulas of the two models are the same, and only the parameters of the models, that is, the mean vectors a_k, the covariance matrices S_k, and the weights, differ.
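The single Gaussian and mixture densities can be written directly from the formulas above (a NumPy sketch; the parameter values used in the check are arbitrary):

```python
import numpy as np

def gaussian_pdf(x, a, S):
    """Single d-dimensional Gaussian density with mean vector a and
    covariance matrix S."""
    d = len(a)
    diff = x - a
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(S))
    return np.exp(-0.5 * diff @ np.linalg.inv(S) @ diff) / norm

def gmm_pdf(x, weights, means, covs):
    """Mixture density p(x) = sum over k of pi_k * p_k(x)."""
    return sum(w * gaussian_pdf(x, a, S)
               for w, a, S in zip(weights, means, covs))

# Standard 2-D Gaussian evaluated at its mean: density is 1 / (2*pi)
print(round(gaussian_pdf(np.zeros(2), np.zeros(2), np.eye(2)), 5))  # 0.15915
```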
  • For each pixel, the embodiment of the present application calculates its first probability density under the skin mixed Gaussian model and its second probability density under the non-skin mixed Gaussian model, until all pixels have been traversed.
  • The traversal may proceed row by row and column by column, or a pixel may be selected at random to determine whether it is a skin pixel and, if so, the pixels within a neighborhood of a certain size around it are traversed first; the present application is not limited in this respect.
  • Suppose the mean vectors of the skin mixed Gaussian model are a_k1,
  • the covariance matrices are S_k1,
  • and the weights of its single Gaussian models are pi_k1;
  • the mean vectors of the non-skin mixed Gaussian model are a_k2,
  • the covariance matrices are S_k2,
  • and the weights of its single Gaussian models are pi_k2.
  • The calculation formula of the posterior probability is as follows:
  • P = p_skin / (p_skin + p_non-skin)
  • where P is the value of the posterior probability,
  • p_skin is the first probability density,
  • and p_non-skin is the second probability density.
  • The embodiment of the present application sets the posterior probability threshold to 0.5: when the value of the posterior probability exceeds 0.5, the pixel corresponding to that posterior probability is determined to belong to the skin region.
  • The posterior probability threshold of 0.5 is an empirical value: a large number of experiments show that if a pixel belongs to the skin, its posterior probability exceeds 0.5.
  • In other embodiments, the posterior probability threshold may also be dynamically adjusted; the application is not limited in this respect.
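The decision rule above reduces to a comparison of the two densities (with equal priors, P > 0.5 is equivalent to p_skin > p_non-skin); a sketch:

```python
def is_skin(p_skin, p_non_skin, threshold=0.5):
    """Posterior probability that a pixel is skin, assuming equal priors:
    P = p_skin / (p_skin + p_non_skin); the pixel is skin when P exceeds
    the threshold (0.5 by default, as in the text)."""
    total = p_skin + p_non_skin
    if total == 0:
        return False        # no evidence either way
    return p_skin / total > threshold

print(is_skin(0.08, 0.02))  # True  (P = 0.8)
print(is_skin(0.01, 0.04))  # False (P = 0.2)
```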
  • In this way, the first binary image under the HSV histogram model and the second binary image under the mixed Gaussian model are obtained.
  • Step 130 performing a bitwise AND operation on the first binary image and the second binary image to obtain a comprehensive binary image.
  • The operation principle of the bitwise AND operation is that if the bits in the same position are both 1, the result is 1; otherwise, the result is 0.
  • Therefore, if the HSV histogram model and the mixed Gaussian model both judge a pixel to be skin, the result of the bitwise operation marks the pixel as a skin pixel; if the results of the two models are inconsistent, the result of the bitwise operation marks the pixel as a non-skin pixel.
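Step 130 is a one-liner on the two binary maps (the `&` operator on NumPy arrays gives the same result as `cv2.bitwise_and`):

```python
import numpy as np

# A pixel stays 255 in the combined map only when both the HSV histogram
# model and the mixed Gaussian model call it skin.
first = np.array([[255, 255, 0, 0]], dtype=np.uint8)   # HSV histogram result
second = np.array([[255, 0, 255, 0]], dtype=np.uint8)  # mixed Gaussian result

combined = first & second
print(combined)  # [[255   0   0   0]]
```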
  • Step 140 Filter the integrated binary image to obtain an optimized binary image.
  • In this embodiment, the integrated binary image is denoised by median filtering to remove scattered isolated pixels in the binarized image, thereby improving the efficiency of the subsequent search for connected regions.
  • Median filtering is a very mature algorithm that is effective at removing impulsive (salt-and-pepper) noise from an image.
  • Its basic principle is that the pixel value at a given position in the target image is determined by the pixel values at the same position and its vicinity in the original image: for example, if a pixel of the original image has 9 pixels in its neighborhood, the 9 pixel values are sorted and the middle value is taken as the value of the target pixel.
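The 9-value sorting described above is exactly a 3 x 3 median filter; a minimal sketch (`cv2.medianBlur(img, 3)` performs the same operation in practice; border pixels are left unchanged here for brevity):

```python
import numpy as np

def median_filter_3x3(img):
    """Minimal 3x3 median filter for a single-channel image."""
    out = img.copy()
    h, w = img.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # sort the 9 neighbourhood values and take the middle one
            out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])
    return out

# A single isolated 255 "noise" pixel on a 0 background is removed
noisy = np.zeros((5, 5), dtype=np.uint8)
noisy[2, 2] = 255
print(median_filter_3x3(noisy)[2, 2])  # 0
```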
  • Step 150 Analyze a largest connected area in the optimized binary image, and use the largest connected area as a skin area.
  • Connected Component generally refers to an image region (Region, Blob) composed of foreground pixel points having the same pixel value and adjacent in the image.
  • Connected Component Analysis refers to finding and marking each connected area in an image.
  • the object of the connected area analysis processing is a binarized image.
  • A connected area is composed of adjacent pixels having the same pixel value; connected areas can therefore be found in the image using these two conditions, and each connected area is given a unique label (Label) to distinguish it from other connected areas.
  • The two-pass scanning method finds and marks all connected areas in the image by scanning it twice.
  • The main idea is as follows: during the first scan, each foreground pixel position is given a label, and the pixels of one connected area may receive one or more different labels, so labels that belong to the same connected area but have different values need to be merged, that is, the equality relations between them are recorded; during the second scan, the pixels whose labels are in the same equality class are merged into one connected area and given the same label (usually the minimum label value of the class).
  • the seed filling method is derived from computer graphics and is often used to fill a graphic.
  • the main idea is to select a foreground pixel as a seed, and then merge the foreground pixels adjacent to the seed into the same pixel set according to the two basic conditions of the connected region (the pixel values are the same and the positions are adjacent).
  • the set of pixels is a connected area.
  • The pixel adjacency relations used in connected area analysis are mainly the 4-neighborhood and the 8-neighborhood.
  • In the embodiment of the present application, the 4-neighborhood is used to analyze the largest connected area in the optimized binary image.
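The seed-filling approach with 4-neighborhood connectivity can be sketched as a breadth-first flood fill; `largest_connected_area` is an illustrative name (`cv2.connectedComponentsWithStats` provides the same information in practice):

```python
from collections import deque

import numpy as np

def largest_connected_area(binary):
    """Return the pixel list of the largest 4-connected foreground region.

    Seed filling: each unvisited foreground pixel seeds a region that is
    grown over foreground 4-neighbours until exhausted."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    best = []
    for i in range(h):
        for j in range(w):
            if binary[i, j] and not seen[i, j]:
                comp, queue = [], deque([(i, j)])
                seen[i, j] = True
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] and not seen[ny, nx]):
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    return best

img = np.array([[1, 1, 0, 0],
                [0, 0, 0, 1],
                [0, 1, 1, 1]], dtype=np.uint8)
print(len(largest_connected_area(img)))  # 4
```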
  • Step 160 Determine whether the largest connected area is a hand shape by using a pre-trained K-nearest neighbor classifier, thereby implementing recognition of a gesture.
  • the K-nearest neighbor classifier is a very mature classifier.
  • Its principle is that if, among the M training data items closest to a given data item, the majority belong to the ith class, the data item is assigned to the ith class.
  • the data is generally a vector that can represent the characteristics of the class.
  • the key to pre-training the K-nearest neighbor classifier is to extract the features of the sample pictures and classify the sample pictures into different classes based on these characteristics.
  • the embodiment of the present application selects the following four features:
  • Feature 1: the ratio of the square of the perimeter of the connected area to its area;
  • Feature 2: the area of the connected area;
  • Feature 3: the mean probability, obtained from the GMM (mixed Gaussian model), of the pixels of the connected region belonging to the skin region;
  • Feature 4: the mean probability, obtained from the HSV histogram model, of the pixels of the connected region belonging to the skin.
  • Feature 3 and feature 4 are calculated by calling the HSV histogram model and the GMM mixed Gaussian model pre-trained in the embodiment of the present application, and are not described again here.
  • The K-nearest neighbor classifier is pre-trained by taking a certain number of hand-shaped and non-hand-shaped image samples and calculating features 1 to 4 of their largest connected regions, yielding feature samples of hand regions and non-hand regions. For a connected graph to be detected, features 1 to 4 are extracted, and based on the statistics of the samples it can be determined whether the connected graph contains a human hand region.
  • A specific implementation may compare features 1 to 4 of the connected graph one by one with features 1 to 4 of the samples in the K-nearest neighbor classifier, computing similarity ratios, and set a reasonable threshold on the similarity ratio: when the similarity ratio is greater than the threshold, the connected graph to be detected is determined to contain a human hand region.
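A sketch of the feature extraction and the K-nearest-neighbor vote; the training numbers below are hypothetical, and in practice the features would be scaled so that the large-magnitude area feature does not dominate the distance:

```python
import numpy as np

def shape_features(perimeter, area, gmm_mean, hist_mean):
    """Feature vector for a candidate region: squared perimeter over area
    (a compactness measure), the area itself, and the two mean skin
    probabilities described in the text."""
    return np.array([perimeter ** 2 / area, area, gmm_mean, hist_mean])

def knn_is_hand(feat, train_feats, train_labels, k=3):
    """Minimal K-nearest-neighbor vote: the region is a hand when the
    majority of its k nearest training samples are hand samples (label 1)."""
    dists = np.linalg.norm(train_feats - feat, axis=1)
    nearest = train_labels[np.argsort(dists)[:k]]
    return nearest.sum() > k / 2

# Hypothetical training set: label 1 = hand region, 0 = non-hand region
train = np.array([[18.0, 900, 0.8, 0.7],
                  [20.0, 850, 0.9, 0.8],
                  [60.0, 40, 0.2, 0.1],
                  [55.0, 30, 0.1, 0.2]])
labels = np.array([1, 1, 0, 0])
print(bool(knn_is_hand(shape_features(130, 880, 0.85, 0.75), train, labels)))
```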
  • In the embodiment of the present application, the skin pixels in the image to be detected are identified based on both the HSV histogram and the GMM model; the results of the two detection methods are then combined and filtered to obtain the optimized binary image corresponding to the image to be detected; finally, through the analysis of the largest connected region and the judgment of the K-nearest neighbor classifier, hand shape recognition is achieved accurately. This effectively addresses the low speed and the false detections of hand-shape recognition in the prior art, thereby indirectly improving the efficiency of gesture recognition in human-computer interaction.
  • FIG. 2 is a technical flowchart of the second embodiment of the present application.
  • the training of the HSV histogram model is mainly implemented by the following steps:
  • Step 210 Perform marking of the skin region and the non-skin region on the sample image to obtain a skin pixel sample and a non-skin pixel sample;
  • the marking of the sample can be done manually to ensure a high degree of accuracy of the sample.
  • Step 220 Convert the skin pixel sample and the non-skin pixel sample from an RGB color space to an HSV color space to obtain a skin HSV pixel sample and a non-skin HSV pixel sample;
  • The specific implementation formulas and the technical effect of the conversion from the RGB color space to the HSV color space are described in step 110 of the first embodiment and are not repeated here.
  • Step 230: Collect statistics of the HSV values of the skin HSV pixel samples, and establish an HSV histogram model of skin pixels according to the distribution of those HSV values;
  • In this embodiment, the frequency distributions of the H value (hue), S value (saturation), and V value (brightness) are calculated separately for the pixels of the skin samples, thereby establishing the HSV histogram model of skin pixels; the same operation is performed on the pixels of the non-skin samples.
  • A core point of the present application is that the gray levels of the HSV histogram model are compressed according to a preset proportional relationship to obtain better histogram statistics.
  • The H, S, and V channels each have 256 gray levels; if all of them were used, the joint histogram would have length 2^24, which is approximately 16 million, and good statistics cannot be obtained when the sample size is not large enough. Therefore, the embodiment of the present application compresses the length of the histogram; the compression ratio can be selected according to experience.
  • In this embodiment, the H channel is compressed to 64 gray levels, the S channel to 32 gray levels, and the V channel to 16 gray levels, that is, in a ratio of 4:2:1;
  • the length of the joint histogram is then 64 * 32 * 16 = 2^15, which is 32768.
  • Different numbers of gray levels are used for the three HSV channels because they are affected by light intensity to different degrees: the H (hue) channel is essentially unaffected by illumination changes, the V channel varies in proportion to the light intensity, and the degree to which the S channel is affected lies somewhere in between.
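The compression amounts to integer division of each channel (assuming each HSV channel has already been scaled to the 0-255 range; the function names are illustrative):

```python
def compress_hsv(h, s, v):
    """Quantize an HSV pixel (each channel 0-255) to the compressed
    64 x 32 x 16 bins; H keeps the most levels because hue is least
    sensitive to illumination."""
    return h // 4, s // 8, v // 16   # 256/4=64, 256/8=32, 256/16=16 bins

def bin_index(h, s, v):
    """Flatten the three bin indices into one histogram index in the
    range 0 .. 32767 (64 * 32 * 16 = 2**15 bins in total)."""
    hb, sb, vb = compress_hsv(h, s, v)
    return (hb * 32 + sb) * 16 + vb

print(bin_index(255, 255, 255))  # 32767, the last bin
```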
  • Step 240: Collect statistics of the HSV values of the non-skin HSV pixel samples, and establish an HSV histogram model of non-skin pixels according to the distribution of the HSV values of the non-skin HSV pixel samples.
  • The execution process and technical effect of establishing the HSV histogram model for the non-skin pixel samples are the same as in step 230 and are not repeated here. It should be noted that there is no fixed order between step 230 and step 240; the embodiment of the present application is not limited in this respect.
  • In this embodiment, the HSV histogram models of skin pixels and non-skin pixels are established by training on skin samples and non-skin samples and compressing the gray levels of the HSV histogram, which reduces the false detection rate of skin pixels even when the number of training samples is small.
  • FIG. 3 is a technical flowchart of Embodiment 3 of the present application.
  • the establishment of a mixed Gaussian model mainly includes the following steps:
  • Step 310 Mark a skin pixel area and a non-skin pixel area of the RGB sample picture to obtain a skin pixel sample and a non-skin pixel sample.
  • In this embodiment, the RGB sample pictures are first marked, which may be done manually, to distinguish the skin areas and non-skin areas in the pictures, thereby obtaining the skin pixel samples and non-skin pixel samples. Pre-classifying the samples helps to improve the efficiency of the subsequent EM algorithm in calculating the parameters of the mixed Gaussian models and how close the estimated parameters are to the actual models.
  • Step 320: Convert the skin pixel samples and the non-skin pixel samples from the RGB color space to the r-g color space, using r = R / (R + G + B), g = G / (R + G + B), b = B / (R + G + B),
  • where R is the red value of the pixel, G is the green value of the pixel, B is the blue value of the pixel, and r, g, b are the color values of the pixel after conversion.
  • Step 330: Using the expectation maximization algorithm, calculate the parameters of the skin pixel mixed Gaussian model and of the non-skin pixel mixed Gaussian model from the color-space-converted skin pixel samples and non-skin pixel samples, respectively.
  • The parameters include a_k, S_k, and pi_k.
  • As described above, the mixed Gaussian model is a weighted superposition of multiple single Gaussian models.
  • The weight of each single Gaussian model is different; that is, the data of the mixed Gaussian model are generated from several single Gaussian models.
  • The number K of single Gaussian models needs to be set in advance, and pi_k is the weight of the kth single Gaussian model.
  • the Expectation Maximization (EM) algorithm is an algorithm for finding a parameter maximum likelihood estimate or a maximum a posteriori estimate in a probabilistic model, where the probability model relies on an unobservable hidden variable.
  • the EM algorithm provides an efficient iterative procedure to calculate the maximum likelihood estimate for these data.
  • Each iteration consists of two steps, the Expectation step and the Maximization step, hence the name EM algorithm.
  • the EM algorithm is a very mature algorithm and the derivation process is complicated, which is not described in detail in the embodiment of the present application.
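A minimal EM iteration in one dimension illustrates the procedure (the r-g models in the text are two-dimensional, but the E- and M-steps are identical in form; all data here is synthetic):

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """Minimal 1-D EM sketch for a k-component Gaussian mixture.

    The E-step computes each component's responsibility for every sample;
    the M-step re-estimates weights pi_k, means a_k and variances S_k."""
    w = np.full(k, 1.0 / k)                  # pi_k, uniform init
    mu = np.linspace(x.min(), x.max(), k)    # a_k, spread-out init
    var = np.full(k, x.var())                # S_k (scalars in 1-D)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        pdf = (np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
               / np.sqrt(2 * np.pi * var))
        resp = w * pdf
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update the parameters from the weighted samples
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

# Two well-separated synthetic clusters; EM should recover both means
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0.0, 0.5, 500), rng.normal(5.0, 0.5, 500)])
w, mu, var = em_gmm_1d(data)
print(np.sort(mu))  # approximately [0., 5.]
```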
  • Step 340 Establish a mixed Gaussian model according to the mixed Gaussian model formula.
  • For the skin samples, the mean vectors a_k1 of the skin mixed Gaussian model, the covariance matrices S_k1, and the weights pi_k1 of the single Gaussian models can be calculated and substituted into the mixed Gaussian model formula.
  • Likewise, the mean vectors a_k2 of the non-skin mixed Gaussian model, the covariance matrices S_k2, and the weights pi_k2 of the single Gaussian models can be calculated, giving the non-skin mixed Gaussian model.
  • When detecting, each pixel of the picture to be detected is read after the color space transformation and substituted into the two models, and p_skin and p_non-skin are calculated for the pixel respectively.
  • In this embodiment, the EM algorithm is used to establish the mixed Gaussian models of skin pixels and non-skin pixels; compared with the prior-art histogram-based skin color detection, a large number of training samples is not needed, which saves resources and improves the efficiency of skin color detection.
  • the establishment of the HSV histogram model and the establishment of the mixed Gaussian model are not sequential, and the matching process between the image to be detected and any of the above two models is also in no order.
  • The description of the embodiments of the present application merely illustrates the respective establishment processes of the two models; the order in which they are established and used is not limited.
  • the non-transitory computer readable storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
  • A skin-color-based human hand detection device mainly includes the following modules: an image conversion module 410, a binary image acquisition module 420, a bitwise operation module 430, a filtering module 440, a connected area determining module 450, a human hand identification module, and a model training module 460.
  • the image conversion module 410 is configured to convert the acquired image to be detected from the RGB color space into the HSV color space to acquire an HSV image, and to convert the image to be detected from the RGB color space into the rg color space to acquire an rg image;
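A minimal NumPy version of the two conversions performed by the image conversion module 410 might look as follows; the formulas are the standard rg-chromaticity and RGB-to-HSV definitions, not code from the patent.

```python
import numpy as np

def rgb_to_rg(img):
    """Normalised rg chromaticity: r = R/(R+G+B), g = G/(R+G+B)."""
    rgb = img.astype(np.float64)
    s = rgb.sum(axis=-1, keepdims=True)
    s[s == 0] = 1.0                      # avoid division by zero on black pixels
    return np.concatenate([rgb[..., 0:1] / s, rgb[..., 1:2] / s], axis=-1)

def rgb_to_hsv(img):
    """RGB (0..255) to HSV with H in degrees, S and V in [0, 1]."""
    rgb = img.astype(np.float64) / 255.0
    mx, mn = rgb.max(axis=-1), rgb.min(axis=-1)
    diff = mx - mn
    v = mx
    s = np.where(mx == 0, 0.0, diff / np.where(mx == 0, 1.0, mx))
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    d = np.where(diff == 0, 1.0, diff)   # avoid division by zero on greys
    h = np.zeros_like(mx)
    h = np.where(mx == r, (60 * (g - b) / d) % 360, h)
    h = np.where(mx == g, 60 * (b - r) / d + 120, h)
    h = np.where(mx == b, 60 * (r - g) / d + 240, h)
    h = np.where(diff == 0, 0.0, h)
    return np.stack([h, s, v], axis=-1)
```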
  • the binary map obtaining module 420 is configured to traverse each pixel in the HSV image and call the HSV histogram model pre-established by the model training module 460 to convert the HSV image into the first binary image, and to traverse each pixel in the rg image and call the mixed Gaussian model pre-established by the model training module to convert the rg image into the second binary image;
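For the HSV histogram branch, a common construction (assumed here, since the patent gives no bin counts) is a normalised 2-D Hue/Saturation histogram built from labelled skin pixels; each pixel of the HSV image is then looked up and its skin probability thresholded to produce the first binary image.

```python
import numpy as np

H_BINS, S_BINS = 30, 32   # bin counts are illustrative, not from the patent

def build_hs_histogram(h, s):
    """Normalised 2-D Hue/Saturation histogram from labelled skin pixels."""
    hist, _, _ = np.histogram2d(h, s, bins=[H_BINS, S_BINS],
                                range=[[0, 360], [0, 1]])
    return hist / hist.sum()

def histogram_binary_map(hsv, hist, thresh=1e-3):
    """Look up each pixel's skin probability and threshold it to 0/1."""
    h_idx = np.clip((hsv[..., 0] / 360 * H_BINS).astype(int), 0, H_BINS - 1)
    s_idx = np.clip((hsv[..., 1] * S_BINS).astype(int), 0, S_BINS - 1)
    return (hist[h_idx, s_idx] > thresh).astype(np.uint8)
```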
  • the bitwise operation module 430 is configured to perform a bitwise AND operation on the first binary image and the second binary image to obtain a comprehensive binary image;
  • the filtering module 440 is configured to filter the integrated binary image to obtain an optimized binary image;
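The bitwise AND operation of module 430 and the filtering of module 440 can be sketched together in NumPy; a 3x3 median filter is assumed here as the filter, since the patent does not name a specific one.

```python
import numpy as np

def combine_and_filter(bin1, bin2):
    """AND the two binary maps, then clean speckle with a 3x3 median filter."""
    combined = np.bitwise_and(bin1, bin2)
    # 3x3 median via stacked shifted copies; borders use edge padding.
    padded = np.pad(combined, 1, mode='edge')
    stack = np.stack([padded[dy:dy + combined.shape[0],
                             dx:dx + combined.shape[1]]
                      for dy in range(3) for dx in range(3)])
    return np.median(stack, axis=0).astype(np.uint8)
```

On a binary image the 3x3 median is a majority vote over each neighbourhood, so isolated false-positive pixels vanish while solid skin blobs survive.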
  • the connected area determining module 450 is configured to analyze the largest connected area in the optimized binary image and use the largest connected area as the skin area.
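Finding the largest connected area can be done with a breadth-first flood fill; 4-connectivity is an assumption of this sketch, as the patent does not specify the connectivity.

```python
import numpy as np
from collections import deque

def largest_connected_region(binary):
    """Return a mask of the largest 4-connected foreground region."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    best_label, best_size, next_label = 0, 0, 1
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and labels[sy, sx] == 0:
                # Flood-fill one region, counting its pixels.
                size, q = 0, deque([(sy, sx)])
                labels[sy, sx] = next_label
                while q:
                    y, x = q.popleft()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x),
                                   (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = next_label
                            q.append((ny, nx))
                if size > best_size:
                    best_size, best_label = size, next_label
                next_label += 1
    if best_size == 0:
        return np.zeros_like(binary, dtype=np.uint8)
    return (labels == best_label).astype(np.uint8)
```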
  • model training module 460 is configured to:
  • model training module 460 is further configured to:
  • the binary map obtaining module 420 is further configured to:
  • if the pixel does not belong to the skin region, the pixel is assigned the value x, and if the pixel belongs to the skin region, the pixel is assigned the value y, thereby obtaining the first binary image;
  • the binary map obtaining module 420 is further configured to:
  • if the pixel point belongs to the skin region, the pixel is assigned the value x, and if the pixel does not belong to the skin region, the pixel is assigned the value y, thereby obtaining the first binary image and the second binary image.
  • the connected area determining module 450 is further configured to:
  • the pre-trained K-nearest neighbor classifier is used to determine whether the largest connected area is a hand shape, thereby realizing gesture recognition.
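The K-nearest-neighbor decision reduces to a majority vote among the k closest training samples. The sketch below uses 2-D toy feature vectors as placeholders; the actual shape features of the connected region are not specified in this excerpt.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    dists = np.linalg.norm(train_x - query, axis=1)   # Euclidean distances
    nearest = train_y[np.argsort(dists)[:k]]          # labels of k nearest
    values, counts = np.unique(nearest, return_counts=True)
    return values[np.argmax(counts)]                  # majority label
```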
  • an electronic device comprising the skin color based human hand detecting device according to any of the preceding embodiments.
  • a non-transitory computer readable storage medium is also provided, the non-transitory computer readable storage medium storing computer executable instructions which, when executed, perform any of the above methods.
  • FIG. 5 is a schematic diagram of a hardware structure of an electronic device for performing a skin-based human hand detection method according to an embodiment of the present application. As shown in FIG. 5, the device includes:
  • one or more processors 510 and a memory 520; one processor 510 is taken as an example in FIG. 5.
  • the apparatus for performing the skin color based human hand detection method may further include: an input device 530 and an output device 540.
  • the processor 510, the memory 520, the input device 530, and the output device 540 may be connected by a bus or other means, as exemplified by a bus connection in FIG. 5.
  • the memory 520, as a non-transitory computer readable storage medium, can be used for storing non-volatile software programs, non-volatile computer executable programs, and modules, such as the program instructions/modules corresponding to the skin color based human hand detection method in the embodiments of the present application (for example, the image conversion module 410, the binary image acquisition module 420, the bitwise operation module 430, the filter module 440, the connected region determination module 450, and the model training module 460 shown in FIG. 4).
  • the processor 510 executes the various functional applications and data processing of the electronic device by running the non-volatile software programs, instructions, and modules stored in the memory 520, that is, implements the skin color based human hand detection method of the above method embodiments.
  • the memory 520 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and applications required for at least one function, and the storage data area may store data created according to use of the skin color based human hand detection device, and the like.
  • the memory 520 may include a high speed random access memory, and may also include a nonvolatile memory such as at least one magnetic disk storage device, flash memory device, or other nonvolatile solid state storage device.
  • memory 520 can optionally include memory remotely located relative to processor 510, which can be connected to the skin color based human hand detection device over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the input device 530 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the skin tone based hand detection device.
  • the output device 540 can include a display device such as a display screen.
  • the one or more modules are stored in the memory 520, and when executed by the one or more processors 510, perform a skin tone based human hand detection method in any of the above method embodiments.
  • the electronic device of the embodiment of the present application exists in various forms, including but not limited to:
  • Mobile communication devices: these devices are characterized by mobile communication functions and are mainly aimed at providing voice and data communication. Such terminals include smart phones (such as the iPhone), multimedia phones, feature phones, and low-end phones.
  • Ultra-mobile personal computer devices: this type of equipment belongs to the category of personal computers, has computing and processing functions, and generally also has mobile Internet access. Such terminals include PDAs, MIDs, and UMPC devices, such as the iPad.
  • Portable entertainment devices: these devices can display and play multimedia content. Such devices include audio and video players (such as the iPod), handheld game consoles, e-book readers, smart toys, and portable car navigation devices.
  • Servers: a server consists of a processor, a hard disk, memory, a system bus, and so on. A server is similar in architecture to a general-purpose computer, but because it must provide highly reliable services, it has high requirements in terms of processing power, stability, reliability, security, scalability, and manageability.
  • the device embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, i.e., they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement the solution without creative labor.

Abstract

The invention relates to a skin color based human hand detection method. The method comprises the following steps: converting an acquired image to be detected from the RGB color space into the HSV color space to acquire an HSV image, and converting the image to be detected from the RGB color space into the r-g color space to acquire an r-g image; converting the HSV image into a first binary image, and converting the r-g image into a second binary image; performing a bitwise AND operation on the first binary image and the second binary image to obtain a comprehensive binary image; filtering the comprehensive binary image to acquire an optimized binary image; analyzing the largest connected region in the optimized binary image, and taking the largest connected region as the skin region; and using a pre-trained K-nearest-neighbor classifier to determine whether the largest connected region is hand-shaped, thereby realizing human hand recognition. The method has a fast detection speed and effectively avoids false detection of the human hand in gesture recognition.
PCT/CN2016/096982 2015-12-01 2016-08-26 Procédé et dispositif de détection de main humaine basés sur une couleur de peau WO2017092431A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510870145.1A CN105893925A (zh) 2015-12-01 2015-12-01 基于肤色的人手检测方法及装置
CN201510870145.1 2015-12-01

Publications (1)

Publication Number Publication Date
WO2017092431A1 true WO2017092431A1 (fr) 2017-06-08

Family

ID=57002958

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/096982 WO2017092431A1 (fr) 2015-12-01 2016-08-26 Procédé et dispositif de détection de main humaine basés sur une couleur de peau

Country Status (2)

Country Link
CN (1) CN105893925A (fr)
WO (1) WO2017092431A1 (fr)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107563272A (zh) * 2017-06-14 2018-01-09 南京理工大学 一种无重叠视域监控系统中目标匹配方法
CN109583331A (zh) * 2018-11-15 2019-04-05 复旦大学 基于深度学习的人手腕脉口位置精准定位方法
CN109784357A (zh) * 2018-11-19 2019-05-21 西安理工大学 一种基于统计模型的图像重拍检测方法
CN109977734A (zh) * 2017-12-28 2019-07-05 华为技术有限公司 图像处理方法和装置
CN110232690A (zh) * 2019-06-05 2019-09-13 广东工业大学 一种图像分割的方法、系统、设备及计算机可读存储介质
CN110473177A (zh) * 2019-07-30 2019-11-19 上海媚测信息科技有限公司 皮肤色素分布预测方法、图像处理系统及存储介质
CN110473191A (zh) * 2019-08-09 2019-11-19 深圳市三宝创新智能有限公司 一种红疹识别方法
CN110619637A (zh) * 2019-08-16 2019-12-27 上海吉汭泰网络科技有限公司 基于模板的服装图像多特征统计分割方法
CN110728286A (zh) * 2019-09-24 2020-01-24 西安理工大学 一种基于火花图像的砂带磨削材料去除率识别方法
CN111079637A (zh) * 2019-12-12 2020-04-28 武汉轻工大学 田间图像中分割油菜花的方法、装置、设备及存储介质
CN111242052A (zh) * 2020-01-16 2020-06-05 成都唐源电气股份有限公司 一种接触网刚柔导线自动判别方法及装置
CN111325728A (zh) * 2020-02-19 2020-06-23 南方科技大学 产品缺陷检测方法、装置、设备及存储介质
CN111429535A (zh) * 2020-03-13 2020-07-17 深圳市雄帝科技股份有限公司 对图像中衣服与背景差异度评估方法、系统、设备及介质
CN111445466A (zh) * 2020-04-01 2020-07-24 济南浪潮高新科技投资发展有限公司 一种螺栓防漏拧检测方法及设备、介质
CN111754486A (zh) * 2020-06-24 2020-10-09 北京百度网讯科技有限公司 图像处理方法、装置、电子设备及存储介质
CN111881789A (zh) * 2020-07-14 2020-11-03 深圳数联天下智能科技有限公司 肤色识别方法、装置、计算设备及计算机存储介质
CN112150438A (zh) * 2020-09-23 2020-12-29 创新奇智(青岛)科技有限公司 断线检测方法、装置、电子设备及存储介质
CN112446865A (zh) * 2020-11-25 2021-03-05 创新奇智(广州)科技有限公司 瑕疵识别方法、装置、设备和存储介质
CN113361483A (zh) * 2021-07-07 2021-09-07 合肥英睿系统技术有限公司 一种交通限速标志检测方法、装置、设备及存储介质
CN113812965A (zh) * 2021-08-19 2021-12-21 杭州回车电子科技有限公司 睡眠状态识别方法、装置、电子装置和存储介质
CN116503404A (zh) * 2023-06-27 2023-07-28 梁山县创新工艺品股份有限公司 塑料玩具质量检测方法、装置、电子设备及存储介质
CN117437217A (zh) * 2023-12-18 2024-01-23 武汉博源新材料科技集团股份有限公司 一种基于图像识别的纸塑产品分拣方法及系统
CN117522863A (zh) * 2023-12-29 2024-02-06 临沂天耀箱包有限公司 基于图像特征的集成箱体质量检测方法

Families Citing this family (16)

Publication number Priority date Publication date Assignee Title
CN105893925A (zh) * 2015-12-01 2016-08-24 乐视致新电子科技(天津)有限公司 基于肤色的人手检测方法及装置
CN106599771B (zh) * 2016-10-21 2019-11-22 上海未来伙伴机器人有限公司 一种手势图像的识别方法及系统
CN107239727A (zh) * 2016-12-07 2017-10-10 北京深鉴智能科技有限公司 手势识别方法和系统
CN109857244B (zh) * 2017-11-30 2023-09-01 百度在线网络技术(北京)有限公司 一种手势识别方法、装置、终端设备、存储介质及vr眼镜
CN108121971B (zh) * 2017-12-25 2018-10-26 哈尔滨拓讯科技有限公司 一种基于动作时序特征的人手检测方法及装置
CN108596237B (zh) * 2018-04-19 2019-11-15 北京邮电大学 一种基于颜色和血管的lci激光内镜下的结肠息肉分类装置
CN109002825A (zh) * 2018-08-07 2018-12-14 成都睿码科技有限责任公司 基于视频分析的手部包扎绷带检测方法
CN109299730B (zh) * 2018-08-27 2022-03-04 鲁东大学 一种目标检测的分类模型建立方法、装置和目标检测设备
CN110991458B (zh) * 2019-11-25 2023-05-23 创新奇智(北京)科技有限公司 基于图像特征的人工智能识别结果抽样系统及抽样方法
CN111242936A (zh) * 2020-01-17 2020-06-05 苏州瓴图智能科技有限公司 一种基于图像的非接触式手掌疱疹检测装置及方法
CN111339315B (zh) * 2020-02-21 2023-05-02 南京星火技术有限公司 知识图谱构建方法、系统、计算机可读介质和电子设备
CN112734767A (zh) * 2020-12-28 2021-04-30 平安科技(深圳)有限公司 基于病理图像组织区域的提取方法、装置、设备及介质
CN113128435B (zh) * 2021-04-27 2022-11-22 南昌虚拟现实研究院股份有限公司 图像中手部区域分割方法、装置、介质及计算机设备
CN113240712A (zh) * 2021-05-11 2021-08-10 西北工业大学 一种基于视觉的水下集群邻居跟踪测量方法
CN113436097B (zh) * 2021-06-24 2022-08-02 湖南快乐阳光互动娱乐传媒有限公司 一种视频抠图方法、装置、存储介质和设备
CN113744263B (zh) * 2021-09-17 2023-10-27 景德镇陶瓷大学 一种小尺寸马赛克陶瓷表面缺陷快速检测的方法

Citations (6)

Publication number Priority date Publication date Assignee Title
US20090169099A1 (en) * 2007-12-05 2009-07-02 Vestel Elektronik Sanayi Ve Ticaret A.S. Method of and apparatus for detecting and adjusting colour values of skin tone pixels
CN103106386A (zh) * 2011-11-10 2013-05-15 华为技术有限公司 动态自适应肤色分割方法和装置
CN103745193A (zh) * 2013-12-17 2014-04-23 小米科技有限责任公司 一种肤色检测方法及装置
US20140147035A1 (en) * 2011-04-11 2014-05-29 Dayaong Ding Hand gesture recognition system
US20140177955A1 (en) * 2012-12-21 2014-06-26 Sadagopan Srinivasan System and method for adaptive skin tone detection
CN105893925A (zh) * 2015-12-01 2016-08-24 乐视致新电子科技(天津)有限公司 基于肤色的人手检测方法及装置

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
CN101251898B (zh) * 2008-03-25 2010-09-15 腾讯科技(深圳)有限公司 一种肤色检测方法及装置
CN102968623B (zh) * 2012-12-07 2015-12-23 上海电机学院 肤色检测系统及方法
CN104318558B (zh) * 2014-10-17 2017-06-23 浙江大学 复杂场景下基于多信息融合的手势分割方法

Cited By (40)

Publication number Priority date Publication date Assignee Title
CN107563272A (zh) * 2017-06-14 2018-01-09 南京理工大学 一种无重叠视域监控系统中目标匹配方法
CN107563272B (zh) * 2017-06-14 2023-06-20 南京理工大学 一种无重叠视域监控系统中目标匹配方法
CN109977734A (zh) * 2017-12-28 2019-07-05 华为技术有限公司 图像处理方法和装置
CN109977734B (zh) * 2017-12-28 2023-06-06 华为技术有限公司 图像处理方法和装置
CN109583331A (zh) * 2018-11-15 2019-04-05 复旦大学 基于深度学习的人手腕脉口位置精准定位方法
CN109583331B (zh) * 2018-11-15 2023-05-02 复旦大学 基于深度学习的人手腕脉口位置精准定位方法
CN109784357A (zh) * 2018-11-19 2019-05-21 西安理工大学 一种基于统计模型的图像重拍检测方法
CN109784357B (zh) * 2018-11-19 2022-10-11 西安理工大学 一种基于统计模型的图像重拍检测方法
CN110232690A (zh) * 2019-06-05 2019-09-13 广东工业大学 一种图像分割的方法、系统、设备及计算机可读存储介质
CN110473177A (zh) * 2019-07-30 2019-11-19 上海媚测信息科技有限公司 皮肤色素分布预测方法、图像处理系统及存储介质
CN110473177B (zh) * 2019-07-30 2022-12-09 上海媚测信息科技有限公司 皮肤色素分布预测方法、图像处理系统及存储介质
CN110473191A (zh) * 2019-08-09 2019-11-19 深圳市三宝创新智能有限公司 一种红疹识别方法
CN110619637A (zh) * 2019-08-16 2019-12-27 上海吉汭泰网络科技有限公司 基于模板的服装图像多特征统计分割方法
CN110728286A (zh) * 2019-09-24 2020-01-24 西安理工大学 一种基于火花图像的砂带磨削材料去除率识别方法
CN110728286B (zh) * 2019-09-24 2023-02-10 西安理工大学 一种基于火花图像的砂带磨削材料去除率识别方法
CN111079637A (zh) * 2019-12-12 2020-04-28 武汉轻工大学 田间图像中分割油菜花的方法、装置、设备及存储介质
CN111079637B (zh) * 2019-12-12 2023-09-08 武汉轻工大学 田间图像中分割油菜花的方法、装置、设备及存储介质
CN111242052A (zh) * 2020-01-16 2020-06-05 成都唐源电气股份有限公司 一种接触网刚柔导线自动判别方法及装置
CN111242052B (zh) * 2020-01-16 2023-08-08 成都唐源电气股份有限公司 一种接触网刚柔导线自动判别方法及装置
CN111325728B (zh) * 2020-02-19 2023-05-30 南方科技大学 产品缺陷检测方法、装置、设备及存储介质
CN111325728A (zh) * 2020-02-19 2020-06-23 南方科技大学 产品缺陷检测方法、装置、设备及存储介质
CN111429535B (zh) * 2020-03-13 2023-09-08 深圳市雄帝科技股份有限公司 对图像中衣服与背景差异度评估方法、系统、设备及介质
CN111429535A (zh) * 2020-03-13 2020-07-17 深圳市雄帝科技股份有限公司 对图像中衣服与背景差异度评估方法、系统、设备及介质
CN111445466B (zh) * 2020-04-01 2023-05-05 山东浪潮科学研究院有限公司 一种螺栓防漏拧检测方法及设备、介质
CN111445466A (zh) * 2020-04-01 2020-07-24 济南浪潮高新科技投资发展有限公司 一种螺栓防漏拧检测方法及设备、介质
CN111754486A (zh) * 2020-06-24 2020-10-09 北京百度网讯科技有限公司 图像处理方法、装置、电子设备及存储介质
CN111754486B (zh) * 2020-06-24 2023-08-15 北京百度网讯科技有限公司 图像处理方法、装置、电子设备及存储介质
CN111881789A (zh) * 2020-07-14 2020-11-03 深圳数联天下智能科技有限公司 肤色识别方法、装置、计算设备及计算机存储介质
CN112150438B (zh) * 2020-09-23 2023-01-20 创新奇智(青岛)科技有限公司 断线检测方法、装置、电子设备及存储介质
CN112150438A (zh) * 2020-09-23 2020-12-29 创新奇智(青岛)科技有限公司 断线检测方法、装置、电子设备及存储介质
CN112446865A (zh) * 2020-11-25 2021-03-05 创新奇智(广州)科技有限公司 瑕疵识别方法、装置、设备和存储介质
CN113361483A (zh) * 2021-07-07 2021-09-07 合肥英睿系统技术有限公司 一种交通限速标志检测方法、装置、设备及存储介质
CN113812965A (zh) * 2021-08-19 2021-12-21 杭州回车电子科技有限公司 睡眠状态识别方法、装置、电子装置和存储介质
CN113812965B (zh) * 2021-08-19 2024-04-09 杭州回车电子科技有限公司 睡眠状态识别方法、装置、电子装置和存储介质
CN116503404A (zh) * 2023-06-27 2023-07-28 梁山县创新工艺品股份有限公司 塑料玩具质量检测方法、装置、电子设备及存储介质
CN116503404B (zh) * 2023-06-27 2023-09-01 梁山县创新工艺品股份有限公司 塑料玩具质量检测方法、装置、电子设备及存储介质
CN117437217A (zh) * 2023-12-18 2024-01-23 武汉博源新材料科技集团股份有限公司 一种基于图像识别的纸塑产品分拣方法及系统
CN117437217B (zh) * 2023-12-18 2024-03-08 武汉博源新材料科技集团股份有限公司 一种基于图像识别的纸塑产品分拣方法及系统
CN117522863A (zh) * 2023-12-29 2024-02-06 临沂天耀箱包有限公司 基于图像特征的集成箱体质量检测方法
CN117522863B (zh) * 2023-12-29 2024-03-29 临沂天耀箱包有限公司 基于图像特征的集成箱体质量检测方法

Also Published As

Publication number Publication date
CN105893925A (zh) 2016-08-24

Similar Documents

Publication Publication Date Title
WO2017092431A1 (fr) Procédé et dispositif de détection de main humaine basés sur une couleur de peau
WO2020207423A1 (fr) Procédé de détection de type de peau, procédé de classification de qualité de type de peau et appareil de détection de type de peau
US10372226B2 (en) Visual language for human computer interfaces
CN111488756B (zh) 基于面部识别的活体检测的方法、电子设备和存储介质
WO2017088365A1 (fr) Procédé et appareil de détection de couleur de peau
KR102449841B1 (ko) 타겟의 검측 방법 및 장치
Ajmal et al. A comparison of RGB and HSV colour spaces for visual attention models
WO2019100282A1 (fr) Procédé et dispositif de reconnaissance de couleur de peau de visage, et terminal intelligent
CN107507144B (zh) 肤色增强的处理方法、装置及图像处理装置
CN108090511B (zh) 图像分类方法、装置、电子设备及可读存储介质
CN104268590B (zh) 基于互补性组合特征与多相回归的盲图像质量评价方法
CN109308711A (zh) 目标检测方法、装置及图像处理设备
CN107506738A (zh) 特征提取方法、图像识别方法、装置及电子设备
CN111709305A (zh) 一种基于局部图像块的人脸年龄识别方法
CN108711160A (zh) 一种基于hsi增强性模型的目标分割方法
KR101334794B1 (ko) 특징정보를 이용하는 꽃 인식 장치 및 방법
KR101344851B1 (ko) 영상처리장치 및 영상처리방법
JP5615344B2 (ja) 色特徴を抽出するための方法および装置
CN107862681B (zh) 一种自拍图像质量推荐方法
CN106402717B (zh) 一种ar播放控制方法及智能台灯
KR100488014B1 (ko) YCrCb 칼라 기반 얼굴 영역 추출 방법
Shemshaki et al. Lip segmentation using geometrical model of color distribution
Shih et al. Multiskin color segmentation through morphological model refinement
Shen et al. A holistic image segmentation framework for cloud detection and extraction
CN111325209A (zh) 一种车牌识别方法和系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16869742

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16869742

Country of ref document: EP

Kind code of ref document: A1