WO2018082308A1 - Image processing method and terminal - Google Patents

Image processing method and terminal

Info

Publication number
WO2018082308A1
WO2018082308A1 · PCT/CN2017/087702 · CN2017087702W
Authority
WO
WIPO (PCT)
Prior art keywords
feature
target
layers
image
group
Prior art date
Application number
PCT/CN2017/087702
Other languages
French (fr)
Chinese (zh)
Inventor
张兆丰
牟永强
Original Assignee
深圳云天励飞技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳云天励飞技术有限公司
Publication of WO2018082308A1 publication Critical patent/WO2018082308A1/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 — Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 — Detection; Localisation; Normalisation
    • G06V40/164 — Detection; Localisation; Normalisation using holistic features
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 — Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 — Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 — Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 — Feature extraction; Face representation

Definitions

  • the present invention relates to the field of image processing technologies, and in particular, to an image processing method and a terminal.
  • With the rapid development of information technology, face recognition technology has been widely used in the field of video surveillance.
  • In face recognition applications, face detection is the first stage, and its accuracy has a great impact on the performance of face recognition.
  • Face detection needs to be robust, because in practical applications face images are affected by many factors, such as lighting, occlusion, and posture changes. Face detection is invoked most frequently in the face recognition process and needs to be executed efficiently.
  • Face detection technology mainly adopts manually designed features, such as the Haar feature, the LBP (local binary pattern histogram) feature, and the HOG (histogram of oriented gradients) feature. The computation time of these features is acceptable and fairly satisfactory results can be obtained in practical applications, so they are widely used. However, in the prior art, face detection algorithms are computationally complex, and thus face detection efficiency is low.
  • Embodiments of the present invention provide an image processing method and a terminal, so as to quickly detect a face position.
  • a first aspect of the embodiments of the present invention provides an image processing method, including: acquiring an image to be processed; calculating the number of layers of a feature pyramid of the image to be processed to obtain n layers, n being an integer greater than or equal to 1; constructing the feature pyramid based on the n layers; performing feature extraction on K preset detection windows on the feature pyramid to obtain K groups of first target features, each group of preset detection windows corresponding to one group of first target features, K being an integer greater than or equal to 1; determining K groups of second target features according to the K groups of first target features; and making a decision on the K groups of second target features by using M specified decision trees to obtain the size and position of a target face frame, M being an integer greater than or equal to 1;
  • the calculating the number of layers of the feature pyramid of the image to be processed to obtain n layers includes computing n from the following quantities:
  • n is the number of layers of the feature pyramid;
  • k_up is the upsampling factor applied to the image to be processed;
  • w_img, h_img respectively represent the width and height of the image to be processed;
  • w_m, h_m respectively represent the width and height of the preset face detection model;
  • n_octave refers to the number of image layers between every two scales (one octave) in the feature pyramid.
  • the constructing the feature pyramid based on the n layers includes:
  • determining that the n layers comprise P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1 and Q is an integer greater than or equal to 0; performing feature extraction on the P real feature layers to obtain third target features; and determining fourth target features of the Q approximate feature layers according to the P real feature layers;
  • the third target features and the fourth target features constitute the feature pyramid.
  • the determining the K groups of second target features according to the K groups of first target features includes:
  • separately extracting color features from the K groups of first target features to obtain K groups of color features; calculating pixel comparison features for the i-th group of color features, training the first preset face model based on the calculated pixel comparison features, and extracting first target pixel comparison features from the trained first preset face model to obtain a fifth target feature, where the i-th group of color features is any one of the K groups of color features;
  • the making a decision on the K groups of second target features by using the M specified decision trees to obtain the size and position of the target face frame includes:
  • making a decision on the K groups of second target features on the feature pyramid by using the M specified decision trees to obtain X face frames, X being an integer greater than or equal to 1, and merging the X face frames to obtain the size and position of the target face frame.
  • a second aspect of the embodiments of the present invention provides a terminal, including:
  • An obtaining unit configured to acquire an image to be processed
  • a calculating unit configured to calculate the number of layers of the feature pyramid of the image to be processed to obtain n layers, where n is an integer greater than or equal to 1;
  • a constructing unit configured to construct the feature pyramid based on the n layers
  • an extracting unit configured to perform feature extraction on K preset detection windows on the feature pyramid to obtain K groups of first target features, where each group of preset detection windows corresponds to one group of first target features, K being an integer greater than or equal to 1;
  • a determining unit configured to determine the K groups of second target features according to the K groups of first target features;
  • a decision unit configured to make a decision on the K groups of second target features by using M specified decision trees to obtain the size and position of the target face frame, where M is an integer greater than or equal to 1.
  • the calculating unit is specifically configured to compute n from the following quantities:
  • n is the number of layers of the feature pyramid;
  • k_up is the upsampling factor applied to the image to be processed;
  • w_img, h_img respectively represent the width and height of the image to be processed;
  • w_m, h_m respectively represent the width and height of the preset face detection model;
  • n_octave refers to the number of image layers between every two scales (one octave) in the feature pyramid.
  • the constructing unit includes:
  • a first determining module configured to determine that the n layers include P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1 and Q is an integer greater than or equal to 0;
  • a first extraction module configured to perform feature extraction on the P real feature layers to obtain third target features;
  • a second determining module configured to determine fourth target features of the Q approximate feature layers according to the P real feature layers;
  • a constructing module configured to form the feature pyramid from the third target features and the fourth target features.
  • the determining unit includes:
  • a second extraction module configured to separately extract color features from the K groups of first target features to obtain K groups of color features;
  • a first training module configured to calculate pixel comparison features for the i-th group of color features, train the first preset face model based on the calculated pixel comparison features, and extract first target pixel comparison features from the trained first preset face model to obtain a fifth target feature, where the i-th group of color features is any one of the K groups of color features;
  • a second training module configured to train a second preset face model by using the fifth target feature and the first target feature, and extract second pixel comparison features from the trained second preset face model to obtain a sixth target feature;
  • a combination module configured to combine the first target feature and the sixth target feature into the second target feature.
  • the decision unit includes:
  • a decision module configured to make a decision on the K groups of second target features on the feature pyramid by using the M specified decision trees to obtain X face frames, where X is an integer greater than or equal to 1;
  • a merging module configured to merge the X face frames to obtain the size and position of the target face frame.
  • In the embodiments, the image to be processed is acquired, and the number of layers of the feature pyramid of the image to be processed is calculated to obtain n layers, where n is an integer greater than or equal to 1. The feature pyramid is constructed based on the n layers, and feature extraction is performed on K preset detection windows on the feature pyramid to obtain K groups of first target features, where each group of preset detection windows corresponds to one group of first target features and K is an integer greater than or equal to 1. The K groups of second target features are determined according to the K groups of first target features, and the M specified decision trees are used to make a decision on the K groups of second target features, obtaining the size and position of the target face frame, where M is an integer greater than or equal to 1. Thereby, the face position can be detected quickly.
  • FIG. 1 is a schematic flowchart of an embodiment of an image processing method according to an embodiment of the present invention;
  • FIG. 2a is a schematic structural diagram of a first embodiment of a terminal according to an embodiment of the present invention;
  • FIG. 2b is a schematic structural diagram of the constructing unit of the terminal depicted in FIG. 2a according to an embodiment of the present invention;
  • FIG. 2c is a schematic structural diagram of the determining unit of the terminal depicted in FIG. 2a according to an embodiment of the present invention;
  • FIG. 2d is a schematic structural diagram of the decision unit of the terminal depicted in FIG. 2a according to an embodiment of the present invention;
  • FIG. 3 is a schematic structural diagram of a second embodiment of a terminal according to an embodiment of the present invention.
  • references to "an embodiment" herein mean that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention.
  • the appearances of the phrase in various places in the specification do not necessarily refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art will explicitly and implicitly understand that the embodiments described herein can be combined with other embodiments.
  • the terminal described in the embodiments of the present invention may include a smartphone (such as an Android phone, an iOS phone, or a Windows Phone device), a tablet computer, a palmtop computer, a notebook computer, a mobile internet device (MID), or a wearable device. The above terminals are merely examples, not an exhaustive list; the terminal includes but is not limited to the above.
  • FIG. 1 is a schematic flowchart of an embodiment of an image processing method according to an embodiment of the present invention.
  • the image processing method described in this embodiment includes the following steps:
  • the image to be processed is an image including a human face.
  • the image to be processed includes at least one face.
  • the terminal can acquire an original image.
  • if the original image is a grayscale image, it needs to be converted into an RGB image; that is, the grayscale information of the original image is copied to the R channel, the G channel, and the B channel.
  • if the original image is a color image but not an RGB image, it can be converted into an RGB image; if the original image is already an RGB image, it is taken directly as the image to be processed.
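The grayscale-to-RGB conversion described above can be sketched as follows. This is a minimal illustration using NumPy arrays; the function name and the array-based image representation are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def to_rgb(image: np.ndarray) -> np.ndarray:
    """Return a 3-channel RGB array for the given image.

    A single-channel (grayscale) image is expanded by copying its
    intensity values into the R, G, and B channels; a 3-channel image
    is returned unchanged.
    """
    if image.ndim == 2:
        # Grayscale: copy the gray plane into all three channels.
        return np.stack([image, image, image], axis=-1)
    if image.ndim == 3 and image.shape[-1] == 3:
        return image
    raise ValueError("expected an HxW grayscale or HxWx3 color image")
```

Conversion from other color spaces (e.g. BGR or YUV) to RGB would be handled by the image-decoding library and is omitted here.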
  • n is an integer greater than or equal to 1.
  • calculating the number of layers of the feature pyramid of the image to be processed to obtain n layers may be implemented by computing n from the following quantities:
  • n is the number of layers of the feature pyramid;
  • k_up is the upsampling factor applied to the image to be processed;
  • w_img and h_img respectively represent the width and height of the image to be processed;
  • w_m and h_m respectively represent the width and height of the preset face detection model;
  • n_octave refers to the number of image layers between every two scales (one octave) in the feature pyramid.
  • since the image to be processed is given, its size is a known quantity, and the size of the preset face detection model is also a known quantity.
  • the above k up can be specified by the user, or the system defaults.
  • the above n octave can be specified by the user, or the system defaults.
  • the obtained feature may form a feature pyramid.
  • a Laplacian pyramid transform is performed on an image to be processed to obtain a feature pyramid.
  • the number of layers of the feature pyramid in the embodiment of the present invention is not specified by the user but is calculated from the size of the image to be processed and the size of the preset face detection model. Thus, images to be processed of different sizes yield feature pyramids with different numbers of layers, so the number of layers of the feature pyramid determined by the embodiment of the present invention is better matched to the size of the image.
  • At least one preset face detection model may be used in the embodiment of the present invention.
  • all preset face detection models may have the same size.
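As a rough illustration, the layer count can be derived from the quantities defined above in the manner of fast feature pyramids (one layer per 1/n_octave of an octave over the usable scale range). This formula is an assumption, since the patent's own expression does not survive in this text; the function name and default values are likewise illustrative:

```python
import math

def pyramid_layers(w_img: int, h_img: int, w_m: int, h_m: int,
                   k_up: float = 1.0, n_octave: int = 8) -> int:
    """Estimate the number of feature-pyramid layers n.

    The largest usable scale is the factor by which the (optionally
    upsampled) image exceeds the detection-model size; layers are then
    spaced n_octave per octave across [1, max_scale].
    """
    # Largest scale factor at which the model window still fits inside the image.
    max_scale = k_up * min(w_img / w_m, h_img / h_m)
    # Number of n_octave-per-octave layers covering the scale range, plus
    # the base layer at scale 1.
    return int(math.floor(n_octave * math.log2(max_scale))) + 1
```

For a 640x480 image and an 80x80 model with no upsampling, this gives 21 layers; larger images yield more layers, matching the behavior described above.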
  • the constructing the feature pyramid based on the n layers may include the following steps:
  • the n layers include P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1 and Q is an integer greater than or equal to 0;
  • the conventional method generally computes the image pyramid first and then computes the corresponding features for each layer image of the pyramid. In the embodiment of the present invention, features are computed directly for only a small number of image layers; these are called real feature layers.
  • the features of the other layer images are obtained by interpolation from the real features and are called approximate feature layers.
  • which layers of the pyramid are real feature layers is specified by the user or by system default; the other layers are approximate feature layers, each obtained by interpolation from the nearest real feature layer.
  • feature extraction may be performed on the real feature layers in step 32, for example extracting color features, gradient magnitude features, and direction histogram features.
  • the color features can be RGB, LUV, HSV, or GRAY; the gradient magnitude feature and the direction histogram feature correspond to a special form of the HOG feature, i.e., one in which the number of cells in a block is one.
  • for the color feature, the gradient magnitude feature, and the direction histogram feature, reference may be made to the prior art; details are not described here again.
  • the feature of the approximate feature layer can be calculated based on the real feature layer.
  • the approximate feature layer can be obtained by interpolation of the real feature layer.
  • the feature value needs to be multiplied by a coefficient.
  • the calculation can refer to the power-law relation f_Ω(I_s) ≈ f_Ω(I) · s^(−λ_Ω), where:
  • s refers to the scale ratio of the approximate feature layer to the real feature layer;
  • λ_Ω is a constant for a given feature channel Ω;
  • the value of λ_Ω can be estimated in the following manner: for each scale s, the ratio f_Ω(I_i^s) / f_Ω(I_i) is computed, where I_i^s refers to scaling the image I_i by the scale s;
  • f_Ω(I) means computing the feature Ω over the image I and averaging the feature values;
  • N refers to the number of pictures participating in the estimation; the number of samples is taken as 50,000, and λ_Ω is found by the least squares method.
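The approximate-layer computation can be sketched as follows. The power-law coefficient s^(−λ) and nearest-neighbour resampling are illustrative assumptions (a production implementation would use bilinear interpolation, and λ would be the per-channel constant estimated offline as described above):

```python
import numpy as np

def approximate_channel(real_channel: np.ndarray, s: float, lam: float) -> np.ndarray:
    """Approximate a feature channel at relative scale s from a real layer.

    The real layer is resampled to the target size and the feature values
    are multiplied by the scale-dependent coefficient s**(-lam), per the
    relation f(I_s) ~ f(I) * s**(-lam).
    """
    h, w = real_channel.shape
    new_h, new_w = max(1, round(h * s)), max(1, round(w * s))
    # Nearest-neighbour resampling keeps this sketch dependency-free.
    rows = np.clip((np.arange(new_h) / s).astype(int), 0, h - 1)
    cols = np.clip((np.arange(new_w) / s).astype(int), 0, w - 1)
    resized = real_channel[np.ix_(rows, cols)]
    return resized * s ** (-lam)
```

Because only resampling and one multiplication are needed per approximate layer, this is far cheaper than recomputing features at every scale.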
  • K is an integer greater than or equal to 1.
  • the preset detection window can be set by the system default or by the user.
  • each preset detection window can include a window size and a window position. Feature extraction is performed for each of the K preset detection windows, and a group of first target features is obtained for each window, so that K groups of first target features are obtained, K being an integer greater than or equal to 1.
  • on a given layer, the size of the preset detection window is fixed.
  • the window can be moved one step at a time in the x and y directions.
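The fixed-size window shifted one step at a time can be enumerated as below. This is a hypothetical helper for illustration; the names and the step parameter are assumptions:

```python
def window_positions(img_w: int, img_h: int, win_w: int, win_h: int, step: int = 1):
    """Yield top-left (x, y) corners of a fixed-size detection window.

    The window size stays fixed for a given pyramid layer; the window is
    shifted by `step` pixels in the x and y directions at a time, staying
    fully inside the image.
    """
    for y in range(0, img_h - win_h + 1, step):
        for x in range(0, img_w - win_w + 1, step):
            yield (x, y)
```

Each yielded position corresponds to one candidate window whose features are then extracted and scored.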
  • determining the K groups of second target features according to the K groups of first target features includes:
  • the method for extracting pixel comparison features in the above steps 52 and 53 may refer to the following definitions:
  • I represents the image I;
  • l_i, l_j are pixel points at different positions in the image I;
  • I(l_i) and I(l_j) respectively refer to the pixel values at the positions l_i and l_j in the image I;
  • comparing the pixel values I(l_i) and I(l_j) yields a comparison feature f_c of the two pixels.
  • the image to be processed can also be divided into area bins that do not overlap each other, each of size b × b, and the comparison feature between bins is defined accordingly:
  • l_i ∈ bin_i, l_j ∈ bin_j; f_cb refers to the pixel comparison feature of two different regions in the image to be processed.
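The two comparison features can be sketched as follows. Since the exact formulas do not survive in this text, the greater-than comparison and the use of bin means are assumptions chosen for illustration:

```python
import numpy as np

def pixel_compare(img: np.ndarray, li, lj) -> int:
    """Binary comparison feature f_c of the pixels at positions li and lj.

    Returns 1 if I(li) > I(lj), else 0 (assumed form of the comparison).
    """
    return int(img[li] > img[lj])

def block_compare(img: np.ndarray, bin_i, bin_j, b: int) -> int:
    """Comparison feature f_cb between two non-overlapping b-by-b bins.

    bin_i and bin_j are (row, col) indices of the bins; each bin is
    summarised by its mean intensity before comparing.
    """
    yi, xi = bin_i
    yj, xj = bin_j
    mean_i = img[yi * b:(yi + 1) * b, xi * b:(xi + 1) * b].mean()
    mean_j = img[yj * b:(yj + 1) * b, xj * b:(xj + 1) * b].mean()
    return int(mean_i > mean_j)
```

Both features are extremely cheap to evaluate, which is why large numbers of them can be offered to the boosting stage described next.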
  • the color, gradient magnitude, and direction histogram features are computed pixel by pixel over the image to be processed; therefore, once the size of the model is fixed, their computation is determined and does not vary with the training process. The pixel comparison features, by contrast, differ depending on the model training process. In order to better fuse the color, gradient magnitude, and direction histogram features with the pixel comparison features, a two-stage training procedure is used.
  • the first preset face model is trained using only the pixel comparison features; the size of the first preset face model is n × n pixels, so during training there are (n/b)² × ((n/b)² − 1)/2 comparison features. Training is performed using the AdaBoost method, with decision trees of depth 5 and 500 trees.
  • after training, the number of pixel comparison features selected by the first preset face model is greatly reduced; the number of these selected features (i.e., the fifth target feature) is kept within 10,000.
  • the second preset face model is trained using the fifth target feature in combination with the first target feature (i.e., the color, gradient magnitude, and direction histogram features). The AdaBoost method is again used for training, with decision-tree depth 5 and 500 trees; the second pixel comparison features are extracted from the trained second preset face model to obtain the sixth target feature.
  • the first target feature and the sixth target feature are combined into the second target feature.
  • the present invention combines the fused multi-channel features with the pixel comparison features, overcoming the problem that the position of the face frame is inaccurate when only the fused multi-channel features are used, and further improving the detection rate of faces in backlit conditions.
  • the embodiment of the present invention may adopt M specified decision trees, where M is an integer greater than or equal to 1. The second target features in a preset detection window are sent to a specified decision tree, which makes a decision on them to obtain a score, and the scores are accumulated. If the accumulated score falls below a certain threshold, the window is eliminated directly; if it is above the threshold, classification continues on the next decision tree, obtaining and accumulating scores until all the decision trees have been traversed. The position coordinates, width, and height of a surviving window are then converted to the coordinates of the image to be processed, and a face frame, including its position and size, is output.
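The accumulate-and-reject procedure described above resembles a soft cascade and can be sketched as follows. The per-stage threshold layout and the callable-tree interface are illustrative assumptions:

```python
def cascade_decide(features, trees, thresholds):
    """Soft-cascade decision over M decision trees.

    `trees` are callables mapping a feature vector to a score;
    `thresholds[m]` is the minimum accumulated score required after
    tree m (assumed layout). Scores are accumulated tree by tree, and a
    window whose running total drops below the stage threshold is
    eliminated immediately. Returns (accepted, accumulated_score).
    """
    total = 0.0
    for tree, thr in zip(trees, thresholds):
        total += tree(features)
        if total < thr:      # below threshold: eliminate the window early
            return False, total
    return True, total       # survived all M trees
```

Early rejection is what makes the cascade fast in practice: most windows contain no face and are discarded after only a few trees.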
  • the making a decision on the K groups of second target features by using the M specified decision trees to obtain the size and position of the target face frame includes the following steps.
  • the terminal may merge the face frames with overlapping positions by using a Non-Maximum Suppression (NMS) algorithm to output a final face frame.
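The merging step can be illustrated with a greedy NMS sketch. The IoU threshold value and the (x, y, w, h) box representation are assumptions; the text above only specifies that overlapping frames are merged via NMS:

```python
def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over candidate face frames.

    boxes are (x, y, w, h); of any set of frames whose pairwise IoU
    exceeds iou_thresh, only the highest-scoring one is kept.
    Returns the indices of the kept boxes in descending score order.
    """
    def iou(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
        iy = max(0, min(ay + ah, by + bh) - max(ay, by))
        inter = ix * iy
        union = aw * ah + bw * bh - inter
        return inter / union if union else 0.0

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep
```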
  • In the embodiment of the present invention, the image to be processed is obtained, the number of layers of the feature pyramid of the image to be processed is calculated to obtain n layers, where n is an integer greater than or equal to 1, and the feature pyramid is constructed based on the n layers.
  • feature extraction is performed on K preset detection windows to obtain K groups of first target features, where each group of preset detection windows corresponds to one group of first target features and K is an integer greater than or equal to 1;
  • the K groups of second target features are determined according to the K groups of first target features;
  • the M specified decision trees are used to make a decision on the K groups of second target features, and the size and position of the target face frame are obtained, where M is an integer greater than or equal to 1.
  • FIG. 2a is a schematic structural diagram of a first embodiment of a terminal according to an embodiment of the present invention.
  • the terminal described in this embodiment includes: an obtaining unit 201, a calculating unit 202, a constructing unit 203, an extracting unit 204, a determining unit 205, and a determining unit 206, as follows:
  • An obtaining unit 201 configured to acquire an image to be processed
  • the calculating unit 202 is configured to calculate the number of layers of the feature pyramid of the image to be processed to obtain n layers, where n is an integer greater than or equal to 1;
  • the constructing unit 203 is configured to construct the feature pyramid based on the n layers;
  • the extracting unit 204 is configured to perform feature extraction on K preset detection windows on the feature pyramid to obtain K groups of first target features, where each group of preset detection windows corresponds to one group of first target features, K being an integer greater than or equal to 1;
  • the determining unit 205 is configured to determine the K groups of second target features according to the K groups of first target features;
  • the decision unit 206 is configured to make a decision on the K groups of second target features by using M specified decision trees to obtain the size and position of the target face frame, where M is an integer greater than or equal to 1.
  • the calculating unit 202 is specifically configured to compute n from the following quantities:
  • n is the number of layers of the feature pyramid;
  • k_up is the upsampling factor applied to the image to be processed;
  • w_img, h_img respectively represent the width and height of the image to be processed;
  • w_m, h_m respectively represent the width and height of the preset face detection model;
  • n_octave refers to the number of image layers between every two scales (one octave) in the feature pyramid.
  • Referring to FIG. 2b, the constructing unit 203 of the terminal depicted in FIG. 2a may include: a first determining module 2031, a first extraction module 2032, a second determining module 2033, and a constructing module 2034, as follows:
  • the first determining module 2031 is configured to determine that the n layers include P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1 and Q is an integer greater than or equal to 0;
  • the first extraction module 2032 is configured to perform feature extraction on the P real feature layers to obtain third target features;
  • the second determining module 2033 is configured to determine fourth target features of the Q approximate feature layers according to the P real feature layers;
  • the constructing module 2034 is configured to form the feature pyramid from the third target features and the fourth target features.
  • the determining unit 205 of the terminal may include: a second extracting module 2051, a first training module 2052, a second training module 2053, and a combining module 2054, as follows:
  • a second extraction module 2051 configured to separately extract color features from the K group first target features to obtain the K group color features
  • the first training module 2052 is configured to calculate pixel comparison features for the i-th group of color features, train the first preset face model based on the calculated pixel comparison features, and extract the first target pixel comparison features from the trained first preset face model to obtain a fifth target feature, where the i-th group of color features is any one of the K groups of color features;
  • the second training module 2053 is configured to train a second preset face model by using the fifth target feature and the first target feature, and extract second pixel comparison features from the trained second preset face model to obtain a sixth target feature;
  • the combining module 2054 is configured to combine the first target feature and the sixth target feature into the second target feature.
  • Referring to FIG. 2d, the decision unit 206 of the terminal depicted in FIG. 2a may include: a decision module 2061 and a merging module 2062, as follows:
  • the decision module 2061 is configured to make a decision on the K groups of second target features on the feature pyramid by using the M specified decision trees to obtain X face frames, where X is an integer greater than or equal to 1;
  • the merging module 2062 is configured to merge the X face frames to obtain the size and position of the target face frame.
  • In the embodiment, the image to be processed is acquired, and the number of layers of the feature pyramid of the image to be processed is calculated to obtain n layers, where n is an integer greater than or equal to 1. The feature pyramid is constructed based on the n layers, and feature extraction is performed on K preset detection windows on the feature pyramid to obtain K groups of first target features, where each group of preset detection windows corresponds to one group of first target features and K is an integer greater than or equal to 1. The K groups of second target features are determined according to the K groups of first target features, and a decision is made on them by using the M specified decision trees to obtain the size and position of the target face frame, where M is an integer greater than or equal to 1.
  • FIG. 3 is a schematic structural diagram of a second embodiment of a terminal according to an embodiment of the present invention.
  • the terminal described in this embodiment includes: at least one input device 1000; at least one output device 2000; at least one processor 3000, such as a CPU; and a memory 4000. The input device 1000, the output device 2000, the processor 3000, and the memory 4000 are connected via a bus 5000.
  • the input device 1000 may be a touch panel, a physical button, or a mouse.
  • the output device 2000 described above may specifically be a display screen.
  • the above memory 4000 may be a high speed RAM memory or a non-volatile memory such as a magnetic disk memory.
  • the above memory 4000 is used to store a set of program codes, and the input device 1000, the output device 2000, and the processor 3000 are used to call the program code stored in the memory 4000 to perform the following operations:
  • the processor 3000 is configured to: acquire an image to be processed; calculate the number of layers of the feature pyramid of the image to be processed to obtain n layers, where n is an integer greater than or equal to 1; construct the feature pyramid based on the n layers; perform feature extraction on K preset detection windows on the feature pyramid to obtain K groups of first target features; determine K groups of second target features according to the K groups of first target features; and make a decision on the K groups of second target features by using M specified decision trees to obtain the size and position of the target face frame.
  • the processor 3000 calculating the number of layers of the feature pyramid of the image to be processed to obtain n layers includes computing n from the following quantities:
  • n is the number of layers of the feature pyramid;
  • k_up is the upsampling factor applied to the image to be processed;
  • w_img, h_img respectively represent the width and height of the image to be processed;
  • w_m, h_m respectively represent the width and height of the preset face detection model;
  • n_octave refers to the number of image layers between every two scales (one octave) in the feature pyramid.
  • the foregoing processor 3000 constructing the feature pyramid based on the n layers includes:
  • determining that the n layers comprise P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1 and Q is an integer greater than or equal to 0; performing feature extraction on the P real feature layers to obtain third target features; and determining fourth target features of the Q approximate feature layers according to the P real feature layers;
  • the third target features and the fourth target features constitute the feature pyramid.
  • the processor 3000 determining the K groups of second target features according to the K groups of first target features includes:
  • separately extracting color features from the K groups of first target features to obtain K groups of color features; calculating pixel comparison features for the i-th group of color features, training the first preset face model based on the calculated pixel comparison features, and extracting the first target pixel comparison features from the trained first preset face model to obtain a fifth target feature, where the i-th group of color features is any one of the K groups of color features;
  • the processor 3000 making a decision on the K groups of second target features by using the M specified decision trees to obtain the size and position of the target face frame includes:
  • making a decision on the K groups of second target features on the feature pyramid to obtain X face frames, X being an integer greater than or equal to 1, and merging the X face frames to obtain the size and position of the target face frame.
  • the embodiment of the present invention further provides a computer storage medium, where the computer storage medium can store a program; when executed, the program performs some or all of the steps of any image processing method described in the foregoing method embodiments.
  • embodiments of the present invention can be provided as a method, apparatus (device), or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • the computer program is stored/distributed in a suitable medium, provided with other hardware or as part of the hardware, or in other distributed forms, such as over the Internet or other wired or wireless telecommunication systems.
  • the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising an instruction apparatus that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • these computer program instructions can also be loaded onto a computer or other programmable data processing device such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Abstract

Provided are an image processing method and a terminal. The method comprises: acquiring an image to be processed; calculating the number of layers of a feature pyramid of the image to be processed so as to obtain n layers, n being an integer greater than or equal to 1; constructing the feature pyramid based on the n layers; performing feature extraction on K pre-set detection windows on the feature pyramid so as to obtain K groups of first target features, wherein each group of pre-set detection windows corresponds to one group of first target features, K being an integer greater than or equal to 1; determining K groups of second target features according to the K groups of first target features; and making a decision on the K groups of second target features by using M specified decision trees so as to obtain the size and position of a target face frame, M being an integer greater than or equal to 1. The position of a face can be quickly detected.

Description

一种图像处理方法及终端Image processing method and terminal
技术领域Technical field
本发明涉及图像处理技术领域,具体涉及一种图像处理方法及终端。The present invention relates to the field of image processing technologies, and in particular, to an image processing method and a terminal.
背景技术Background technique
随着信息技术的快速发展,人脸识别技术在视频监控领域得到了广泛应用。在人脸识别应用领域,人脸检测作为第一个环节,其准确性对人脸识别的性能有很大影响。人脸检测需要具有很强的鲁棒性,因为在实际应用中,人脸图片会受到多种因素的影响,例如光照、遮挡、姿态变化等。人脸检测在人脸识别过程调用的频次最高,需要能够被高效地执行。人脸检测技术主要采用基于手工设计的特征实现,例如Haar特征、LBP(局部二值模式直方图)特征、HOG(梯度方向直方图)特征等,这些特征的计算时间可接受,在实际的应用中也能取得较为满意的结果,因而得到广泛的应用,但是,现有技术中,人脸检测计算算法较为复杂,因而,人脸检测效率较低。With the rapid development of information technology, face recognition has been widely applied in the field of video surveillance. In face recognition applications, face detection is the first step, and its accuracy has a great impact on recognition performance. Face detection needs to be highly robust, because in practical applications face images are affected by many factors, such as illumination, occlusion and pose changes. Face detection is also invoked most frequently in the face recognition pipeline and therefore needs to be executed efficiently. Face detection is mainly implemented with hand-crafted features, such as Haar features, LBP (local binary pattern histogram) features and HOG (histogram of oriented gradients) features; the computation time of these features is acceptable and they achieve fairly satisfactory results in practice, so they are widely used. However, in the prior art the face detection algorithms are relatively complex, and face detection efficiency is therefore low.
发明内容Summary of the invention
本发明实施例提供了一种图像处理方法及终端,以期快速检测到人脸位置。Embodiments of the present invention provide an image processing method and a terminal, so as to quickly detect a face position.
本发明实施例第一方面提供了一种图像处理方法,包括:A first aspect of the embodiments of the present invention provides an image processing method, including:
获取待处理图像;Get the image to be processed;
计算所述待处理图像的特征金字塔的层数,得到n层,所述n为大于或等于1的整数;Calculating a number of layers of the feature pyramid of the image to be processed, to obtain an n layer, wherein n is an integer greater than or equal to 1;
基于所述n层,构造所述特征金字塔;Constructing the feature pyramid based on the n layers;
在所述特征金字塔上,对K个预设检测窗口进行特征提取,得到所述K组第一目标特征,其中,每一组所述预设检测窗口对应一组第一目标特征,所述K为大于或等于1的整数;Performing feature extraction on the K preset detection windows to obtain the K group first target features, wherein each set of the preset detection windows corresponds to a set of first target features, the K An integer greater than or equal to 1;
根据所述K组第一目标特征确定所述K组第二目标特征;Determining the K group second target feature according to the K group first target feature;
采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,所述M为大于或等于1的整数。Making a decision on the K groups of second target features by using M specified decision trees to obtain the size and position of the target face frame, wherein M is an integer greater than or equal to 1.
结合第一方面,在第一方面的第一种可能的实施方式中,所述计算所述待处理图像的特征金字塔的层数,得到n层,包括:With reference to the first aspect, in a first possible implementation manner of the first aspect, the calculating a number of layers of a feature pyramid of the to-be-processed image to obtain an n layer includes:
根据所述待处理图像的尺寸和预设人脸检测模型的尺寸计算特征金字塔的层数,如下公式所示:Calculating the number of layers of the feature pyramid according to the size of the image to be processed and the size of the preset face detection model, as shown in the following formula:

n = floor( n_octave · log2( k_up · min( w_img/w_m , h_img/h_m ) ) ) + 1

其中,n表示所述特征金字塔的层数,k_up是所述待处理图像上采样的倍数,w_img、h_img分别表示所述待处理图像的宽度和高度,w_m、h_m分别表示所述预设人脸检测模型的宽度和高度,n_octave指所述特征金字塔中每两倍尺寸之间的图像的层数。Where n denotes the number of layers of the feature pyramid, k_up is the upsampling factor of the image to be processed, w_img and h_img denote the width and height of the image to be processed, w_m and h_m denote the width and height of the preset face detection model, and n_octave denotes the number of pyramid layers between every doubling of scale.
结合第一方面或第一方面的第一种可能的实施方式,在第一方面的第二种可能的实施方式中,所述基于所述n层,构造所述特征金字塔,包括:With reference to the first aspect, or the first possible implementation manner of the first aspect, in the second possible implementation manner of the first aspect, the constructing the feature pyramid based on the n layers includes:
确定所述n层包含P个实特征层和Q个近似特征层,所述P为大于或等于1的整数,所述Q为大于或等于0的整数;Determining that the n layers comprise P real feature layers and Q approximate feature layers, wherein P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 0;
对所述P个实特征层进行特征提取,得到第三目标特征;Performing feature extraction on the P real feature layers to obtain a third target feature;
根据所述P个实特征层,确定所述Q个近似特征层的第四目标特征;Determining, according to the P real feature layers, a fourth target feature of the Q approximate feature layers;
将所述第三目标特征和所述第四目标特征构成所述特征金字塔。The third target feature and the fourth target feature constitute the feature pyramid.
结合第一方面或第一方面的第一种可能的实施方式,在第一方面的第三种可能的实施方式中,所述根据所述K组第一目标特征确定所述K组第二目标特征,包括:With reference to the first aspect or the first possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, the determining the K target second target according to the K group first target feature Features, including:
从所述K组第一目标特征中分别提取颜色特征,得到所述K组颜色特征;Extracting color features from the K group first target features to obtain the K group color features;
对第i组颜色特征计算像素比较特征,基于所述计算像素比较特征训练第一预设人脸模型,并从训练后的所述第一预设人脸模型提取第一目标像素比较特征,得到第五目标特征,其中,所述第i组颜色特征为所述K组颜色特征中的任一组颜色特征;Calculating a pixel comparison feature for the i-th color feature, training the first preset face model based on the calculated pixel comparison feature, and extracting the first target pixel comparison feature from the trained first preset face model, a fifth target feature, wherein the ith group color feature is any one of the K group color features;
通过所述第五目标特征和所述第一目标特征训练第二预设人脸模型,并从训练后的所述第二预设人脸模型提取第二像素比较特征,得到第六目标特征;Training a second preset face model by using the fifth target feature and the first target feature, and extracting a second pixel comparison feature from the trained second preset face model to obtain a sixth target feature;
将所述第一目标特征和所述第六目标特征组合为所述第二目标特征。Combining the first target feature and the sixth target feature into the second target feature.
结合第一方面或第一方面的第一种可能的实施方式,在第一方面的第四种可能的实施方式中,所述采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,包括: With reference to the first aspect or the first possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the determining, by using the M specified decision trees, the second target feature of the K group , get the size and position of the target face frame, including:
在所述特征金字塔上,采用M个指定决策树对所述K组第二目标特征进行决策,得到X个人脸框,其中,所述X为大于或等于1的整数;Making a decision on the K groups of second target features on the feature pyramid by using M specified decision trees, to obtain X face frames, wherein X is an integer greater than or equal to 1;
根据所述X个人脸框合并为所述目标人脸框的大小和位置。Merging the X face frames to obtain the size and position of the target face frame.
本发明实施例第二方面提供了一种终端,包括:A second aspect of the embodiments of the present invention provides a terminal, including:
获取单元,用于获取待处理图像;An obtaining unit, configured to acquire an image to be processed;
计算单元,用于计算所述待处理图像的特征金字塔的层数,得到n层,所述n为大于或等于1的整数;a calculating unit, configured to calculate a number of layers of the feature pyramid of the image to be processed, to obtain an n layer, wherein n is an integer greater than or equal to 1;
构造单元,用于基于所述n层,构造所述特征金字塔;a constructing unit configured to construct the feature pyramid based on the n layers;
提取单元,用于在所述特征金字塔上,对K个预设检测窗口进行特征提取,得到所述K组第一目标特征,其中,每一组所述预设检测窗口对应一组第一目标特征,所述K为大于或等于1的整数;An extracting unit, configured to perform feature extraction on the K preset detection windows on the feature pyramid to obtain the K group first target feature, wherein each set of the preset detection window corresponds to a group of first targets a characteristic, the K being an integer greater than or equal to 1;
确定单元,用于根据所述K组第一目标特征确定所述K组第二目标特征;a determining unit, configured to determine the K group second target feature according to the K group first target feature;
决策单元,用于采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,所述M为大于或等于1的整数。And a decision unit, configured to make a decision on the K groups of second target features by using M specified decision trees to obtain the size and position of the target face frame, wherein M is an integer greater than or equal to 1.
结合第二方面,在第二方面的第一种可能的实施方式中,所述计算单元具体用于:With reference to the second aspect, in a first possible implementation manner of the second aspect, the calculating unit is specifically configured to:
根据所述待处理图像的尺寸和预设人脸检测模型的尺寸计算特征金字塔的层数,如下公式所示:Calculating the number of layers of the feature pyramid according to the size of the image to be processed and the size of the preset face detection model, as shown in the following formula:

n = floor( n_octave · log2( k_up · min( w_img/w_m , h_img/h_m ) ) ) + 1

其中,n表示所述特征金字塔的层数,k_up是所述待处理图像上采样的倍数,w_img、h_img分别表示所述待处理图像的宽度和高度,w_m、h_m分别表示所述预设人脸检测模型的宽度和高度,n_octave指所述特征金字塔中每两倍尺寸之间的图像的层数。Where n denotes the number of layers of the feature pyramid, k_up is the upsampling factor of the image to be processed, w_img and h_img denote the width and height of the image to be processed, w_m and h_m denote the width and height of the preset face detection model, and n_octave denotes the number of pyramid layers between every doubling of scale.
结合第二方面或第二方面的第一种可能的实施方式,在第二方面的第二种可能的实施方式中,所述构造单元包括:In conjunction with the second aspect, or the first possible implementation of the second aspect, in the second possible implementation of the second aspect, the constructing unit includes:
第一确定模块,用于确定所述n层包含P个实特征层和Q个近似特征层,所述P为大于或等于1的整数,所述Q为大于或等于0的整数;a first determining module, configured to determine that the n layers include P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 0;
第一提取模块,用于对所述P个实特征层进行特征提取,得到第三目标特征;a first extraction module, configured to perform feature extraction on the P real feature layers to obtain a third target feature;
第二确定模块,用于根据所述P个实特征层,确定所述Q个近似特征层的第四目标特征;a second determining module, configured to determine, according to the P real feature layers, the fourth target features of the Q approximate feature layers;
构造模块,用于将所述第三目标特征和所述第四目标特征构成所述特征金字塔。And a constructing module, configured to form the third target feature and the fourth target feature into the feature pyramid.
结合第二方面或第二方面的第一种可能的实施方式,在第二方面的第三种可能的实施方式中,所述确定单元包括:With reference to the second aspect or the first possible implementation manner of the second aspect, in the third possible implementation manner of the second aspect, the determining unit includes:
第二提取模块,用于从所述K组第一目标特征中分别提取颜色特征,得到所述K组颜色特征;a second extraction module, configured to separately extract color features from the K group first target features to obtain the K group color features;
第一训练模块,用于对第i组颜色特征计算像素比较特征,基于所述计算像素比较特征训练第一预设人脸模型,并从训练后的所述第一预设人脸模型提取第一目标像素比较特征,得到第五目标特征,其中,所述第i组颜色特征为所述K组颜色特征中的任一组颜色特征;a first training module, configured to calculate a pixel comparison feature for the i-th color feature, train the first preset face model based on the calculated pixel comparison feature, and extract the first preset face model from the training a target pixel comparison feature to obtain a fifth target feature, wherein the ith group color feature is any one of the K group color features;
第二训练模块,用于通过所述第五目标特征和所述第一目标特征训练第二预设人脸模型,并从训练后的所述第二预设人脸模型提取第二像素比较特征,得到第六目标特征;a second training module, configured to train a second preset face model by using the fifth target feature and the first target feature, and extract a second pixel comparison feature from the trained second preset face model , obtaining the sixth target feature;
组合模块,用于将所述第一目标特征和所述第六目标特征组合为所述第二目标特征。And a combination module, configured to combine the first target feature and the sixth target feature into the second target feature.
结合第二方面或第二方面的第一种可能的实施方式,在第二方面的第四种可能的实施方式中,所述决策单元包括:With reference to the second aspect, or the first possible implementation manner of the second aspect, in the fourth possible implementation manner of the second aspect, the determining unit includes:
决策模块,用于在所述特征金字塔上,采用M个指定决策树对所述K组第二目标特征进行决策,得到X个人脸框,其中,所述X为大于或等于1的整数;a decision module, configured to make a decision on the K groups of second target features on the feature pyramid by using M specified decision trees, to obtain X face frames, wherein X is an integer greater than or equal to 1;
合并模块,用于根据所述X个人脸框合并为所述目标人脸框的大小和位置。And a merging module, configured to merge the X face frames into the size and position of the target face frame.
实施本发明实施例,具有如下有益效果:Embodiments of the present invention have the following beneficial effects:
通过本发明实施例,获取待处理图像,计算待处理图像的特征金字塔的层数,得到n层,n为大于或等于1的整数,基于n层,构造所述特征金字塔,在特征金字塔上,对K个预设检测窗口进行特征提取,得到K组第一目标特征,其中,每一组预设检测窗口对应一组第一目标特征,K为大于或等于1的整数,根据K组第一目标特征确定K组第二目标特征,采用M个指定决策树对K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,M为大于或等于1的整数。从而,可快速检测到人脸位置。According to the embodiments of the present invention, an image to be processed is acquired; the number of layers of its feature pyramid is calculated to obtain n layers, n being an integer greater than or equal to 1; the feature pyramid is constructed based on the n layers; feature extraction is performed on K preset detection windows on the feature pyramid to obtain K groups of first target features, each group of preset detection windows corresponding to one group of first target features, K being an integer greater than or equal to 1; K groups of second target features are determined according to the K groups of first target features; and a decision is made on the K groups of second target features by using M specified decision trees to obtain the size and position of a target face frame, M being an integer greater than or equal to 1. Thereby, the position of a face can be detected quickly.
附图说明DRAWINGS
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are some embodiments of the present invention, Those skilled in the art can also obtain other drawings based on these drawings without paying any creative work.
图1是本发明实施例提供的一种图像处理方法的实施例流程示意图;1 is a schematic flowchart of an embodiment of an image processing method according to an embodiment of the present invention;
图2a是本发明实施例提供的一种终端的第一实施例结构示意图;2a is a schematic structural diagram of a first embodiment of a terminal according to an embodiment of the present invention;
图2b是本发明实施例提供的图2a所描述的终端的构造单元的结构示意图;2b is a schematic structural diagram of a structural unit of the terminal depicted in FIG. 2a according to an embodiment of the present invention;
图2c是本发明实施例提供的图2a所描述的终端的确定单元的结构示意图;2c is a schematic structural diagram of a determining unit of the terminal depicted in FIG. 2a according to an embodiment of the present invention;
图2d是本发明实施例提供的图2a所描述的终端的确定单元的结构示意图;2d is a schematic structural diagram of a determining unit of the terminal depicted in FIG. 2a according to an embodiment of the present invention;
图3是本发明实施例提供的一种终端的第二实施例结构示意图。FIG. 3 is a schematic structural diagram of a second embodiment of a terminal according to an embodiment of the present invention.
具体实施方式detailed description
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The technical solutions in the embodiments of the present invention are clearly and completely described in the following with reference to the accompanying drawings in the embodiments of the present invention. It is obvious that the described embodiments are a part of the embodiments of the present invention, but not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.
本发明的说明书和权利要求书及所述附图中的术语“第一”、“第二”、“第三”和“第四”等是用于区别不同对象,而不是用于描述特定顺序。此外,术语“包括”和“具有”以及它们任何变形,意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其它步骤或单元。The terms "first", "second", "third", and "fourth" and the like in the specification and claims of the present invention are used to distinguish different objects, and are not intended to describe a specific order. . Furthermore, the terms "comprises" and "comprising" and "comprising" are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units not listed, or alternatively Other steps or units inherent to these processes, methods, products or equipment.
在本文中提及“实施例”意味着,结合实施例描述的特定特征、结构或特性可以包含在本发明的至少一个实施例中。在说明书中的各个位置展示该短语并不一定均是指相同的实施例,也不是与其它实施例互斥的独立的或备选的实施例。本领域技术人员显式地和隐式地理解的是,本文所描述的实施例可以与其它实施例相结合。References to "an embodiment" herein mean that a particular feature, structure, or characteristic described in connection with the embodiments can be included in at least one embodiment of the invention. The appearances of the phrases in various places in the specification are not necessarily referring to the same embodiments, and are not exclusive or alternative embodiments that are mutually exclusive. Those skilled in the art will understand and implicitly understand that the embodiments described herein can be combined with other embodiments.
本发明实施例所描述的终端可以包括智能手机(如Android手机、iOS手机、Windows Phone手机等)、平板电脑、掌上电脑、笔记本电脑、移动互联网设备 (MID,Mobile Internet Devices)或穿戴式设备等,上述终端仅是举例,而非穷举,包含但不限于上述终端。The terminal described in the embodiment of the present invention may include a smart phone (such as an Android mobile phone, an iOS mobile phone, a Windows Phone mobile phone, etc.), a tablet computer, a palmtop computer, a notebook computer, and a mobile internet device. (MID, Mobile Internet Devices) or wearable devices, etc., the above terminals are merely examples, not exhaustive, including but not limited to the above terminals.
请参阅图1,为本发明实施例提供的一种图像处理方法的实施例流程示意图。本实施例中所描述的图像处理方法,包括以下步骤:FIG. 1 is a schematic flowchart of an embodiment of an image processing method according to an embodiment of the present invention. The image processing method described in this embodiment includes the following steps:
101、获取待处理图像。101. Acquire an image to be processed.
其中,待处理图像为包含人脸的图像,当然,待处理图像至少包含一个人脸。The image to be processed is an image including a human face. Of course, the image to be processed includes at least one face.
可选地,终端可获取原始图像,若该原始图像为灰度图像,则需将图像转化成RGB图像,即将原始图像的灰度信息,复制到R通道、G通道和B通道上。当然,若原始图像为彩色图像,若该原始图像不是RGB图像,可将其转化为RGB图像,若该原始图像为RGB图像,直接将其作为待处理图像。Optionally, the terminal can acquire the original image. If the original image is a grayscale image, the image needs to be converted into an RGB image, that is, the grayscale information of the original image is copied to the R channel, the G channel, and the B channel. Of course, if the original image is a color image, if the original image is not an RGB image, it can be converted into an RGB image, and if the original image is an RGB image, it is directly taken as an image to be processed.
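A minimal sketch of the grayscale-to-RGB conversion described above, i.e. copying the gray values onto the R, G and B channels (the use of NumPy is an assumption; the text does not prescribe an implementation):

```python
import numpy as np

def to_rgb(image):
    """Replicate a single-channel grayscale image onto R, G and B channels.

    A 3-channel input is assumed to already be RGB and is returned unchanged.
    """
    image = np.asarray(image)
    if image.ndim == 2:  # grayscale: copy intensity to all three channels
        return np.stack([image, image, image], axis=-1)
    return image

gray = np.array([[0, 128], [255, 64]], dtype=np.uint8)
rgb = to_rgb(gray)
print(rgb.shape)  # (2, 2, 3)
```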
102、计算所述待处理图像的特征金字塔的层数,得到n层,所述n为大于或等于1的整数。102. Calculate a number of layers of the feature pyramid of the image to be processed to obtain an n layer, where n is an integer greater than or equal to 1.
可选地,上述计算所述待处理图像的特征金字塔的层数,得到n层,可按照如下方式实施:Optionally, calculating the number of layers of the feature pyramid of the image to be processed to obtain the n layer may be implemented as follows:
根据所述待处理图像的尺寸和预设人脸检测模型的尺寸计算特征金字塔的层数,如下公式所示:Calculating the number of layers of the feature pyramid according to the size of the image to be processed and the size of the preset face detection model, as shown in the following formula:

n = floor( n_octave · log2( k_up · min( w_img/w_m , h_img/h_m ) ) ) + 1

其中,n表示特征金字塔的层数,k_up是待处理图像上采样的倍数,w_img、h_img分别表示待处理图像的宽度和高度,w_m、h_m分别表示预设人脸检测模型的宽度和高度,n_octave指特征金字塔中每两倍尺寸之间的图像的层数。其中,在确定了待处理图像之后,其尺寸可为已知量,而预设人脸模型的尺寸也为已知量。上述k_up可由用户指定,或者系统默认。上述n_octave可由用户指定,或者系统默认。Where n denotes the number of layers of the feature pyramid, k_up is the upsampling factor of the image to be processed, w_img and h_img denote the width and height of the image to be processed, w_m and h_m denote the width and height of the preset face detection model, and n_octave denotes the number of pyramid layers between every doubling of scale. After the image to be processed has been determined, its size is a known quantity, and the size of the preset face model is also known. The above k_up may be specified by the user or take a system default, and likewise n_octave may be specified by the user or take a system default.
可选地,当在对待处理图像进行特征提取后,得到的特征可形成特征金字塔。例如,对待处理图像进行拉普拉斯金字塔变换,可得到特征金字塔。而本发明实施例中的特征金字塔的层数并非由用户指定,而是根据待处理图像的尺寸和预设人脸检测模型的尺寸计算得到,因而,不同尺寸的待处理图像,其确定的特征金字塔的层数不一样,从而,本发明实施例确定的特征金字塔的层数更加贴切于图像的尺寸。Optionally, after feature extraction is performed on the image to be processed, the obtained features can form a feature pyramid; for example, performing a Laplacian pyramid transform on the image to be processed yields a feature pyramid. The number of layers of the feature pyramid in the embodiments of the present invention is not specified by the user but is calculated from the size of the image to be processed and the size of the preset face detection model; images of different sizes therefore yield different numbers of pyramid layers, so the number of layers determined by the embodiments of the present invention is better suited to the size of the image.
当然,本发明实施例中可使用至少一个预设人脸检测模型,在预设人脸检测模型的个数为多个时,则所有的预设人脸检测模型的尺寸可一样。Certainly, at least one preset face detection model may be used in the embodiment of the present invention. When the number of preset face detection models is multiple, all preset face detection models may have the same size.
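A minimal sketch of the layer-count computation, assuming the formula n = floor(n_octave · log2(k_up · min(w_img/w_m, h_img/h_m))) + 1 implied by the symbol definitions in the text; the default values of k_up and n_octave below are illustrative only, since the text leaves both to the user or the system:

```python
import math

def pyramid_layers(w_img, h_img, w_m, h_m, k_up=2, n_octave=8):
    """n = floor(n_octave * log2(k_up * min(w_img/w_m, h_img/h_m))) + 1."""
    scale = k_up * min(w_img / w_m, h_img / h_m)
    return math.floor(n_octave * math.log2(scale)) + 1

# a 640x480 image with an 80x80 face model, at most 2x upsampling,
# 8 pyramid layers per octave (doubling of scale)
print(pyramid_layers(640, 480, 80, 80))  # 29
```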
103、基于所述n层,构造所述特征金字塔。103. Construct the feature pyramid based on the n layers.
可选地,上述基于所述n层,构造所述特征金字塔,可包括如下步骤:Optionally, constructing the feature pyramid based on the n layers may include the following steps:
31)、确定所述n层包含P个实特征层和Q个近似特征层,所述P为大于或等于1的整数,所述Q为大于或等于0的整数;31) determining that the n layers include P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1, and Q is an integer greater than or equal to 0;
32)、对所述P个实特征层进行特征提取,得到第三目标特征;32) performing feature extraction on the P real feature layers to obtain a third target feature;
33)、根据所述P个实特征层,确定所述Q个近似特征层的第四目标特征;33) determining, according to the P real feature layers, a fourth target feature of the Q approximate feature layers;
34)、将所述第三目标特征和所述第四目标特征构成所述特征金字塔。34) constituting the third target feature and the fourth target feature to form the feature pyramid.
需要说明的是,本发明实施例中,与传统人脸检测方法不同的是,传统方法一般是先计算图像的金字塔,再基于金字塔的每层图像,计算相应的特征。本发明中,仅计算少量图像层的特征,称作实特征层。其他层图像的特征,是基于实特征插值得出的,称作近似特征层。由用户指定或者系统默认指定金字塔中的实特征层,其他的层则为近似特征层,它们由与其距离最近的实特征层插值得到。It should be noted that the embodiments of the present invention differ from conventional face detection methods: a conventional method generally first computes the pyramid of the image and then computes the corresponding features for each layer of the pyramid. In the present invention, features are computed for only a small number of image layers, called real feature layers; the features of the other layers are obtained by interpolation from the real features and are called approximate feature layers. The real feature layers in the pyramid are specified by the user or by system default, and the other layers are approximate feature layers, obtained by interpolation from the nearest real feature layer.
其中,步骤32中可对实特征层进行特征提取,例如,提取颜色特征、梯度幅值特征、方向直方图特征。颜色特征可以为RGB、LUV、HSV、GRAY,梯度幅值特征、方向直方图特征相当于HOG特征的一种特殊形式,即block中cell的数目为1。具体地,提取颜色特征、梯度幅值特征、方向直方图特征可参考现有技术,在此不再赘述。In step 32, feature extraction may be performed on the real feature layers, for example extracting color features, gradient magnitude features and orientation histogram features. The color features may be RGB, LUV, HSV or GRAY; the gradient magnitude and orientation histogram features amount to a special form of the HOG feature in which the number of cells per block is 1. For the details of extracting the color, gradient magnitude and orientation histogram features, reference may be made to the prior art, which is not repeated here.
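A rough sketch of the per-pixel channels named above (one color channel, gradient magnitude, and a one-cell-per-block orientation histogram); the bin count and the use of unsigned orientations are illustrative assumptions, not prescribed by the text:

```python
import numpy as np

def channel_features(gray, n_bins=6):
    """Compute three kinds of channels for one real feature layer:
    a color channel (here the gray values themselves), the gradient
    magnitude, and an orientation histogram where each pixel votes its
    magnitude into one of n_bins unsigned-orientation bins."""
    gray = gray.astype(np.float64)
    gy, gx = np.gradient(gray)                # derivatives along rows, cols
    mag = np.hypot(gx, gy)                    # gradient magnitude channel
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation in [0, pi)
    hist = np.zeros(gray.shape + (n_bins,))
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    np.put_along_axis(hist, bins[..., None], mag[..., None], axis=-1)
    return {"color": gray, "magnitude": mag, "orientation_hist": hist}

img = np.tile(np.arange(8.0), (8, 1))  # horizontal intensity ramp
ch = channel_features(img)
print(ch["magnitude"].shape, ch["orientation_hist"].shape)  # (8, 8) (8, 8, 6)
```

Summing the orientation histogram over its bins recovers the magnitude channel, since each pixel votes its full magnitude into exactly one bin.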
其中,步骤33中可基于实特征层,计算近似特征层的特征。近似特征层可由实特征层插值得到,插值时需要将特征值乘以一个系数k_s,其计算方法可参照如下公式:Wherein, in step 33, the features of the approximate feature layers can be calculated based on the real feature layers. An approximate feature layer can be obtained by interpolating a real feature layer; when interpolating, the feature values need to be multiplied by a coefficient k_s, which can be calculated by the following formula:

k_s = s^(-λ_Ω)

其中,s指近似特征层相对于实特征层的比例,λ_Ω对一种特征来说为常数,可以采用以下方式估计λ_Ω的值。估计时,由k_μs来代替k_s:Where s refers to the scale of the approximate feature layer relative to the real feature layer, and λ_Ω is a constant for a given feature. The value of λ_Ω can be estimated in the following manner, replacing k_s by the averaged ratio k_μs:

k_μs = (1/N) · Σ_{i=1..N} f_μΩ( R(I_i, s) ) / f_μΩ( I_i )

其中,R(I_i, s)指对图像I_i按比例s进行缩放,f_μΩ(I)指对图像I求特征Ω,并将这些特征取平均,N指参与估计的图片数目。本发明中,N取50000,利用最小二乘法求得λ_Ω。Where R(I_i, s) refers to scaling the image I_i by the ratio s, f_μΩ(I) refers to computing the feature Ω of the image I and averaging these feature values, and N refers to the number of images participating in the estimation. In the present invention, N is taken as 50000, and λ_Ω is obtained by the least squares method.
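A minimal sketch of how an approximate layer could be produced from its nearest real layer, assuming the power-law correction k_s = s^(-λ_Ω) for the channel values; the constant λ_Ω = 0.35 and the nearest-neighbour resampling are illustrative assumptions (in the text λ_Ω is estimated per feature by least squares):

```python
import numpy as np

def approximate_layer(real_channel, s, lam):
    """Approximate a feature layer at relative scale s from a real layer:
    resample the channel to the new size, then multiply by s**(-lam)."""
    h, w = real_channel.shape
    nh, nw = max(1, int(round(h * s))), max(1, int(round(w * s)))
    rows = (np.arange(nh) / s).astype(int).clip(0, h - 1)
    cols = (np.arange(nw) / s).astype(int).clip(0, w - 1)
    resized = real_channel[np.ix_(rows, cols)]  # nearest-neighbour resample
    return resized * (s ** -lam)

C = np.full((8, 8), 2.0)                 # a constant real feature channel
approx = approximate_layer(C, s=0.5, lam=0.35)
print(approx.shape)  # (4, 4)
```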
104、在所述特征金字塔上,对K个预设检测窗口进行特征提取,得到所述K组第一目标特征,其中,每一组所述预设检测窗口对应一组第一目标特征,所述K为大于或等于1的整数。104. Perform feature extraction on the K preset detection windows on the feature pyramid to obtain the K group first target feature, where each set of the preset detection window corresponds to a set of first target features. K is an integer greater than or equal to 1.
其中,预设检测窗口可由系统默认或者用户自行设置。预设检测窗口可包括窗口大小和窗口位置。对K个预设检测窗口中每一预设检测窗口进行特征提取,可分别得到一组第一目标特征,于是,可得到K组第一目标特征,上述K为大于或等于1的整数。The preset detection windows may be set by system default or by the user, and each preset detection window may include a window size and a window position. Feature extraction is performed on each of the K preset detection windows, and a group of first target features is obtained for each of them; thus, K groups of first target features are obtained, K being an integer greater than or equal to 1.
可选地,在上述特征金字塔上,预设检测窗口的位置、窗口的大小是固定的,在特征提取过程中,每次可沿x、y方向移动一个步长。Optionally, on the feature pyramid, the position of the preset detection window and the size of the window are fixed. In the feature extraction process, one step can be moved in the x and y directions each time.
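The sliding of a fixed-size detection window over one pyramid layer, one step at a time along x and y, can be sketched as follows (the flattened window contents stand in for the per-window features):

```python
import numpy as np

def sliding_windows(channel, win_h, win_w, step=1):
    """Slide a fixed-size detection window over one pyramid layer,
    moving `step` pixels along x and y, and yield (y, x, features)."""
    h, w = channel.shape
    for y in range(0, h - win_h + 1, step):
        for x in range(0, w - win_w + 1, step):
            yield y, x, channel[y:y + win_h, x:x + win_w].ravel()

layer = np.arange(25.0).reshape(5, 5)
wins = list(sliding_windows(layer, 3, 3))
print(len(wins))  # 9 window positions on a 5x5 layer with a 3x3 window
```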
105、根据所述K组第一目标特征确定所述K组第二目标特征。105. Determine, according to the K group first target feature, the K group second target feature.
可选地,上述根据所述K组第一目标特征确定所述K组第二目标特征,包括:Optionally, determining, according to the K group first target feature, the second target feature of the K group, including:
51)、从所述K组第一目标特征中分别提取颜色特征,得到所述K组颜色特征;51) extracting color features from the first target features of the K group to obtain the K group color features;
52)、对第i组颜色特征计算像素比较特征,基于所述计算像素比较特征训练第一预设人脸模型,并从训练后的所述第一预设人脸模型提取第一目标像素比较特征,得到第五目标特征,其中,所述第i组颜色特征为所述K组颜色特征中的任一组颜色特征;52) calculating a pixel comparison feature for the i-th color feature, training the first preset face model based on the calculated pixel comparison feature, and extracting the first target pixel from the trained first preset face model Feature, obtaining a fifth target feature, wherein the ith set of color features is any one of the K sets of color features;
53)、通过所述第五目标特征和所述第一目标特征训练第二预设人脸模型,并从训练后的所述第二预设人脸模型提取第二像素比较特征,得到第六目标特征;53) training a second preset face model by using the fifth target feature and the first target feature, and extracting a second pixel comparison feature from the trained second preset face model to obtain a sixth Target feature
54)、将所述第一目标特征和所述第六目标特征组合为所述第二目标特征。54) Combining the first target feature and the sixth target feature into the second target feature.
其中,上述步骤52和步骤53中提取像素比较特征的方法可参考如下公式:The method for extracting pixel comparison features in the above steps 52 and 53 may refer to the following formula:

f_c(I; l_i, l_j) = 1, 若 I(l_i) > I(l_j);否则为 0 (1 if I(l_i) > I(l_j), otherwise 0)

其中,I表示图像I,l_i、l_j为图像I中不同位置的像素点,I(l_i)、I(l_j)分别指图像I中l_i、l_j位置处的像素值,比较I(l_i)、I(l_j)的像素值大小即可得到两像素的比较特征。Where I denotes the image, l_i and l_j are pixels at different positions in image I, and I(l_i), I(l_j) denote the pixel values at positions l_i and l_j in image I; comparing the magnitudes of I(l_i) and I(l_j) yields the comparison feature of the two pixels.
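A direct transcription of the two-pixel comparison described above; the 0/1 codomain is an assumption consistent with comparing the two pixel values:

```python
import numpy as np

def pixel_compare(image, li, lj):
    """Binary pixel-comparison feature: 1 if the pixel value at location
    li exceeds that at lj, else 0."""
    return int(image[li] > image[lj])

I = np.array([[10, 200],
              [30, 30]])
print(pixel_compare(I, (0, 1), (0, 0)))  # 1, since 200 > 10
print(pixel_compare(I, (1, 0), (1, 1)))  # 0, since 30 > 30 is false
```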
可选地,为了提高比较特征的鲁棒性和全局性,还可以将待处理图像分为若干个互不重叠的区域bin,区域的尺寸为b×b,以bin为单位的比较特征定义如下公式:Optionally, in order to improve the robustness and globality of the comparison feature, the image to be processed can also be divided into a number of mutually non-overlapping regions (bins) of size b×b, and the bin-level comparison feature is defined by the following formula:

f_cb(I; bin_i, bin_j) = 1, 若 Σ_{l_i∈bin_i} I(l_i) > Σ_{l_j∈bin_j} I(l_j);否则为 0

其中,l_i∈bin_i、l_j∈bin_j,f_cb指待处理图像中两个不同区域的像素比较特征。上述提到的颜色特征、梯度幅值特征、方向直方图特征是对待处理图像逐像素计算得到的,因而,当模型的尺寸固定后,不会因为训练过程的不同而决定某特征是否计算;比较特征则不一样,依赖于模型训练过程。为了更好地融合颜色、梯度幅值、方向直方图特征与像素比较特征,采用如下训练流程:Where l_i∈bin_i and l_j∈bin_j, and f_cb denotes the pixel comparison feature of two different regions of the image to be processed. The color, gradient magnitude and orientation histogram features mentioned above are computed pixel by pixel on the image to be processed, so once the model size is fixed, whether a given feature is computed does not depend on the training process; the comparison features are different and depend on the model training process. In order to better fuse the color, gradient magnitude and orientation histogram features with the pixel comparison features, the following training procedure is used:
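A sketch of the bin-level comparison feature, under the assumption that the b×b regions are compared by their aggregated (here averaged) pixel values; for equal-sized bins, comparing means and comparing sums are equivalent:

```python
import numpy as np

def bin_compare(image, b, bi, bj):
    """Split the image into non-overlapping b x b regions and return 1 if
    the aggregate value of bin bi exceeds that of bin bj, else 0."""
    h, w = image.shape
    cells = image[:h - h % b, :w - w % b].reshape(h // b, b, w // b, b)
    means = cells.mean(axis=(1, 3))  # one aggregate value per bin
    return int(means[bi] > means[bj])

I = np.block([[np.zeros((2, 2)), np.ones((2, 2)) * 9],
              [np.ones((2, 2)),  np.ones((2, 2)) * 5]])
print(bin_compare(I, 2, (0, 1), (0, 0)))  # 1: top-right bin (9) > top-left (0)
```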
首先,仅使用像素比较特征训练第一预设人脸模型,第一预设人脸模型的大小为n×n像素。则训练时,有(n/b)²×((n/b)²-1)/2种比较特征。使用adaboost方法进行训练,决策树的深度为5,个数为500。First, the first preset face model is trained using only the pixel comparison features; the size of the first preset face model is n×n pixels, so during training there are (n/b)² × ((n/b)² − 1)/2 candidate comparison features. Training uses the adaboost method, with decision trees of depth 5 and 500 trees.
其次,训练之后,从第一预设人脸模型中挑选出的像素比较特征将大幅减少,该像素比较特征(即第五目标特征)的数目控制在10000以内。Secondly, after the training, the pixel comparison features selected from the first preset face model will be greatly reduced, and the number of the pixel comparison features (ie, the fifth target feature) is controlled within 10000.
然后,联合使用第五目标特征以及第一目标特征(即:颜色特征、梯度幅值、方向直方图特征)训练第二预设人脸模型。仍然使用adaboost方法进行训练,决策树的深度为5,个数为500,并从训练后的第二预设人脸模型提取第二像素比较特征,得到第六目标特征;Then, the second preset face model is trained using the fifth target features together with the first target features (i.e. the color, gradient magnitude and orientation histogram features). The adaboost method is again used, with decision trees of depth 5 and 500 trees, and the second pixel comparison features are extracted from the trained second preset face model to obtain the sixth target features.
最后,将第一目标特征和第六目标特征组合为第二目标特征。Finally, the first target feature and the sixth target feature are combined into a second target feature.
因此,本发明联合使用了融合多通道特征与像素比较特征,克服了仅使用融合多通道特征时的人脸框位置不准确的问题,并进一步提高了逆光情况下的人脸的检出率。Therefore, the present invention combines the use of the fused multi-channel feature and the pixel comparison feature, overcomes the problem that the position of the face frame is inaccurate when only the fused multi-channel feature is used, and further improves the detection rate of the face in the case of backlighting.
106、采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,所述M为大于或等于1的整数。106. Use M specified decision trees to make a decision on the K groups of second target features to obtain the size and position of the target face frame, where M is an integer greater than or equal to 1.
其中,本发明实施例可采用M个指定决策树,其中,M为大于或等于1的整数,将预设检测窗口内的第二目标特征送入指定决策树,对该第二目标特征进行决策,获取分数并累加得分,若得分低于某一阈值,则直接淘汰该窗口。若得分高于阈值,则在下一棵决策树上继续进行分类,获取分数并累加得分,直至遍历完所有决策树,将该窗口的位置坐标、宽、高信息转换到待处理图像上,输出人脸框,包括人脸框的位置和大小。例如,检测完1个窗口后,可转到1.5进行下一个窗口的检测,直至遍历完特征金字塔的所有层,因而,可将最后得到的所有人脸框进行合并,于是得到目标人脸框,进而,确定目标人脸框的位置和大小。如此,可进一步在识别到人脸的基础上,进行人脸识别。The embodiment of the present invention may use M specified decision trees, where M is an integer greater than or equal to 1. The second target feature within a preset detection window is fed into a specified decision tree, which makes a decision on that feature; a score is obtained and accumulated. If the accumulated score falls below a certain threshold, the window is eliminated directly. If the score is above the threshold, classification continues on the next decision tree, obtaining and accumulating scores, until all decision trees have been traversed; the position coordinates, width, and height of the window are then mapped back onto the image to be processed to output a face frame, including its position and size. For example, after one window has been detected, detection can proceed (at 1.5) to the next window until all layers of the feature pyramid have been traversed; all the resulting face frames can then be merged to obtain the target face frame, and thus its position and size. In this way, face recognition can further be performed once a face has been recognized.
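The score-and-reject loop over the M decision trees described above can be sketched as follows. This is a hedged illustration: the weak classifiers are abstracted as callables returning ±1, and the per-stage rejection thresholds are assumed inputs rather than values given in the text:

```python
def score_window(x, weak_classifiers, weights, thresholds):
    """Accumulate the weighted votes of the decision trees for one detection
    window; reject the window as soon as the running score drops below
    the current stage's threshold."""
    s = 0.0
    for h, a, thr in zip(weak_classifiers, weights, thresholds):
        s += a * h(x)
        if s < thr:
            return None          # window eliminated early
    return s                     # survived all trees: candidate face frame
```

Windows that survive every tree keep their final score; their frames are later merged into the target face frame.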
可选地,上述采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,包括:Optionally, using the M specified decision trees to make a decision on the K groups of second target features to obtain the size and position of the target face frame includes:
61)、在所述特征金字塔上,采用M个指定决策树对所述K组第二目标特征进行决策,得到X个人脸框,其中,所述X为大于或等于1的整数;61) On the feature pyramid, use the M specified decision trees to make a decision on the K groups of second target features to obtain X face frames, where X is an integer greater than or equal to 1;
62)、根据所述X个人脸框合并为所述目标人脸框的大小和位置。62) Merge the X face frames to obtain the size and position of the target face frame.
其中,步骤61Among them, step 61
其中,步骤62中,终端可利用非极大值抑制(Non-Maximum Suppression,NMS)算法将位置重叠的人脸框合并,输出最终的人脸框。In step 62, the terminal may merge face frames with overlapping positions using a non-maximum suppression (NMS) algorithm and output the final face frame.
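A minimal greedy NMS sketch consistent with step 62. The (x, y, w, h) box format and the 0.5 IoU threshold are assumptions for illustration; the patent does not fix them here:

```python
def nms(boxes, scores, iou_thr=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring face frame,
    drop frames overlapping it beyond the IoU threshold, and repeat."""
    def iou(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
        ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
        inter = iw * ih
        return inter / (aw * ah + bw * bh - inter) if inter > 0 else 0.0
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thr for j in keep):
            keep.append(i)
    return keep     # indices of the merged output face frames
```

The kept indices correspond to the merged face frames whose sizes and positions are output as the target face frames.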
可以看出,通过本发明实施例,获取待处理图像,计算待处理图像的特征金字塔的层数,得到n层,n为大于或等于1的整数,基于n层,构造所述特征金字塔,在特征金字塔上,对K个预设检测窗口进行特征提取,得到K组第一目标特征,其中,每一组预设检测窗口对应一组第一目标特征,K为大于或等于1的整数,根据K组第一目标特征确定K组第二目标特征,采用M个指定决策树对K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,M为大于或等于1的整数。从而,可快速检测到人脸位置。It can be seen that, in this embodiment of the present invention, the image to be processed is acquired; the number of layers of its feature pyramid is calculated, obtaining n layers, where n is an integer greater than or equal to 1; the feature pyramid is constructed based on the n layers; on the feature pyramid, feature extraction is performed on K preset detection windows to obtain K groups of first target features, where each preset detection window corresponds to one group of first target features and K is an integer greater than or equal to 1; K groups of second target features are determined from the K groups of first target features; and M specified decision trees are used to make a decision on the K groups of second target features to obtain the size and position of the target face frame, where M is an integer greater than or equal to 1. The face position can thus be detected quickly.
与上述一致地,以下为实施上述图像处理方法的装置,具体如下:Consistent to the above, the following is an apparatus for implementing the above image processing method, as follows:
请参阅图2a,为本发明实施例提供的一种终端的第一实施例结构示意图。本实施例中所描述的终端,包括:获取单元201、计算单元202、构造单元203、提取单元204、确定单元205和决策单元206,具体如下:Referring to FIG. 2a, it is a schematic structural diagram of a first embodiment of a terminal according to an embodiment of the present invention. The terminal described in this embodiment includes: an obtaining unit 201, a calculating unit 202, a constructing unit 203, an extracting unit 204, a determining unit 205, and a decision unit 206, as follows:
获取单元201,用于获取待处理图像;An obtaining unit 201, configured to acquire an image to be processed;
计算单元202,用于计算所述待处理图像的特征金字塔的层数,得到n层,所述n为大于或等于1的整数;The calculating unit 202 is configured to calculate a number of layers of the feature pyramid of the image to be processed, to obtain an n layer, where n is an integer greater than or equal to 1;
构造单元203,用于基于所述n层,构造所述特征金字塔;The constructing unit 203 is configured to construct the feature pyramid based on the n layer;
提取单元204,用于在所述特征金字塔上,对K个预设检测窗口进行特征提取,得到所述K组第一目标特征,其中,每一组所述预设检测窗口对应一组第一目标特征,所述K为大于或等于1的整数;The extracting unit 204 is configured to perform feature extraction on K preset detection windows on the feature pyramid to obtain the K groups of first target features, where each preset detection window corresponds to one group of first target features and K is an integer greater than or equal to 1;
确定单元205,用于根据所述K组第一目标特征确定所述K组第二目标特征;a determining unit 205, configured to determine the K groups of second target features according to the K groups of first target features;
决策单元206,用于采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,所述M为大于或等于1的整数。The decision unit 206 is configured to use M specified decision trees to make a decision on the K groups of second target features to obtain the size and position of the target face frame, where M is an integer greater than or equal to 1.
可选地,上述计算单元202具体用于:Optionally, the calculating unit 202 is specifically configured to:
根据所述待处理图像的尺寸和预设人脸检测模型的尺寸计算特征金字塔的层数,如下公式所示:Calculating the number of layers of the feature pyramid according to the size of the image to be processed and the size of the preset face detection model, as shown in the following formula:
Figure PCTCN2017087702-appb-000010
其中,n表示所述特征金字塔的层数,kup是所述待处理图像上采样的倍数,wimg、himg分别表示所述待处理图像的宽度和高度,wm、hm分别表示所述预设人脸检测模型的宽度和高度,noctave指所述特征金字塔中每两倍尺寸之间的图像的层数。Where n denotes the number of layers of the feature pyramid, kup is the upsampling factor of the image to be processed, wimg and himg respectively denote the width and height of the image to be processed, wm and hm respectively denote the width and height of the preset face detection model, and noctave refers to the number of image layers between every doubling of scale in the feature pyramid.
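The formula itself appears above only as an image placeholder, so the sketch below assumes a common octave-based form built from the symbols just defined; the exact expression (rounding, +1 offset) is an assumption, not a reproduction of the patent's formula:

```python
import math

def pyramid_layer_count(w_img, h_img, w_m, h_m, k_up=1.0, n_octave=8):
    """Number of feature-pyramid layers n. Assumed form (the patent's
    formula is only available as an image here):
    n = floor(n_octave * log2(k_up * min(w_img / w_m, h_img / h_m))) + 1"""
    scale = k_up * min(w_img / w_m, h_img / h_m)
    return int(math.floor(n_octave * math.log2(scale))) + 1
```

With this form, a larger image-to-model size ratio or a larger upsampling factor kup yields more pyramid layers, and an image the same size as the model yields a single layer.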
可选地,如图2b所示,图2a中所描述的终端的构造单元203可包括:第一确定模块2031、第一提取模块2032、第二确定模块2033和构造模块2034,具体如下:Optionally, as shown in FIG. 2b, the constructing unit 203 of the terminal described in FIG. 2a may include: a first determining module 2031, a first extracting module 2032, a second determining module 2033, and a constructing module 2034, as follows:
第一确定模块2031,用于确定所述n层包含P个实特征层和Q个近似特征层,所述P为大于或等于1的整数,所述Q为大于或等于0的整数;The first determining module 2031 is configured to determine that the n layers include P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1 and Q is an integer greater than or equal to 0;
第一提取模块2032,用于对所述P个实特征层进行特征提取,得到第三目标特征;a first extraction module 2032, configured to perform feature extraction on the P real feature layers to obtain a third target feature;
第二确定模块2033,用于根据所述P个实特征层,确定所述Q个近似特征层的第四目标特征;a second determining module 2033, configured to determine, according to the P real feature layers, a fourth target feature of the Q approximate feature layers;
构造模块2034,用于将所述第三目标特征和所述第四目标特征构成所述特征金字塔。The constructing module 2034 is configured to form the third target feature and the fourth target feature to form the feature pyramid.
可选地,如图2c所示,图2a中所描述的终端的确定单元205可包括:第二提取模块2051、第一训练模块2052、第二训练模块2053和组合模块2054,具体如下:Optionally, as shown in FIG. 2c, the determining unit 205 of the terminal described in FIG. 2a may include: a second extracting module 2051, a first training module 2052, a second training module 2053, and a combining module 2054, as follows:
第二提取模块2051,用于从所述K组第一目标特征中分别提取颜色特征,得到所述K组颜色特征;a second extraction module 2051, configured to separately extract color features from the K group first target features to obtain the K group color features;
第一训练模块2052,用于对第i组颜色特征计算像素比较特征,基于所述像素比较特征训练第一预设人脸模型,并从训练后的所述第一预设人脸模型提取第一目标像素比较特征,得到第五目标特征,其中,所述第i组颜色特征为所述K组颜色特征中的任一组颜色特征;The first training module 2052 is configured to calculate pixel comparison features for the i-th group of color features, train the first preset face model based on the calculated pixel comparison features, and extract the first target pixel comparison feature from the trained first preset face model to obtain the fifth target feature, where the i-th group of color features is any one of the K groups of color features;
第二训练模块2053,用于通过所述第五目标特征和所述第一目标特征训练第二预设人脸模型,并从训练后的所述第二预设人脸模型提取第二像素比较特征,得到第六目标特征;a second training module 2053, configured to train the second preset face model by using the fifth target feature and the first target feature, and extract the second pixel comparison feature from the trained second preset face model to obtain the sixth target feature;
组合模块2054,用于将所述第一目标特征和所述第六目标特征组合为所述第二目标特征。The combining module 2054 is configured to combine the first target feature and the sixth target feature into the second target feature.
可选地,如图2d所示,图2a中所描述的终端的决策单元206可包括:决策模块2061和合并模块2062,具体如下:Optionally, as shown in FIG. 2d, the decision unit 206 of the terminal described in FIG. 2a may include: a decision module 2061 and a merging module 2062, as follows:
决策模块2061,用于在所述特征金字塔上,采用M个指定决策树对所述K组第二目标特征进行决策,得到X个人脸框,其中,所述X为大于或等于1的整数;The decision module 2061 is configured to, on the feature pyramid, use the M specified decision trees to make a decision on the K groups of second target features to obtain X face frames, where X is an integer greater than or equal to 1;
合并模块2062,用于根据所述X个人脸框合并为所述目标人脸框的大小和位置。The merging module 2062 is configured to merge the X face frames to obtain the size and position of the target face frame.
可以看出,通过本发明实施例所描述的终端,获取待处理图像,计算待处理图像的特征金字塔的层数,得到n层,n为大于或等于1的整数,基于n层,构造所述特征金字塔,在特征金字塔上,对K个预设检测窗口进行特征提取,得到K组第一目标特征,其中,每一组预设检测窗口对应一组第一目标特征,K为大于或等于1的整数,根据K组第一目标特征确定K组第二目标特征,采用M个指定决策树对K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,M为大于或等于1的整数。从而,可快速检测到人脸位置。It can be seen that, with the terminal described in this embodiment of the present invention, the image to be processed is acquired; the number of layers of its feature pyramid is calculated, obtaining n layers, where n is an integer greater than or equal to 1; the feature pyramid is constructed based on the n layers; on the feature pyramid, feature extraction is performed on K preset detection windows to obtain K groups of first target features, where each preset detection window corresponds to one group of first target features and K is an integer greater than or equal to 1; K groups of second target features are determined from the K groups of first target features; and M specified decision trees are used to make a decision on the K groups of second target features to obtain the size and position of the target face frame, where M is an integer greater than or equal to 1. The face position can thus be detected quickly.
与上述一致地,请参阅图3,为本发明实施例提供的一种终端的第二实施例结构示意图。本实施例中所描述的终端,包括:至少一个输入设备1000;至少一个输出设备2000;至少一个处理器3000,例如CPU;和存储器4000,上述输入设备1000、输出设备2000、处理器3000和存储器4000通过总线5000连接。With reference to FIG. 3, it is a schematic structural diagram of a second embodiment of a terminal according to an embodiment of the present invention. The terminal described in this embodiment includes: at least one input device 1000; at least one output device 2000; at least one processor 3000, such as a CPU; and a memory 4000, the input device 1000, the output device 2000, the processor 3000, and the memory 4000 is connected via bus 5000.
其中,上述输入设备1000具体可为触控面板、物理按键或者鼠标。The input device 1000 may be a touch panel, a physical button, or a mouse.
上述输出设备2000具体可为显示屏。The output device 2000 described above may specifically be a display screen.
上述存储器4000可以是高速RAM存储器,也可为非易失存储器(non-volatile memory),例如磁盘存储器。上述存储器4000用于存储一组程序代码,上述输入设备1000、输出设备2000和处理器3000用于调用存储器4000中存储的程序代码,执行如下操作:The memory 4000 may be a high-speed RAM memory, or may be a non-volatile memory such as a magnetic disk memory. The memory 4000 is configured to store a set of program code, and the input device 1000, the output device 2000, and the processor 3000 are configured to call the program code stored in the memory 4000 to perform the following operations:
上述处理器3000,用于:The processor 3000 is configured to:
获取待处理图像;Get the image to be processed;
计算所述待处理图像的特征金字塔的层数,得到n层,所述n为大于或等于1的整数;Calculating a number of layers of the feature pyramid of the image to be processed, to obtain an n layer, wherein n is an integer greater than or equal to 1;
基于所述n层,构造所述特征金字塔;Constructing the feature pyramid based on the n layers;
在所述特征金字塔上,对K个预设检测窗口进行特征提取,得到所述K组第一目标特征,其中,每一组所述预设检测窗口对应一组第一目标特征,所述K为大于或等于1的整数;Performing feature extraction on K preset detection windows on the feature pyramid to obtain the K groups of first target features, where each preset detection window corresponds to one group of first target features and K is an integer greater than or equal to 1;
根据所述K组第一目标特征确定所述K组第二目标特征;Determining the K group second target feature according to the K group first target feature;
采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,所述M为大于或等于1的整数。Use M specified decision trees to make a decision on the K groups of second target features to obtain the size and position of the target face frame, where M is an integer greater than or equal to 1.
可选地,上述处理器3000计算所述待处理图像的特征金字塔的层数,得到n层,包括:Optionally, the processor 3000 calculates the number of layers of the feature pyramid of the image to be processed, and obtains n layers, including:
根据所述待处理图像的尺寸和预设人脸检测模型的尺寸计算特征金字塔的层数,如下公式所示:Calculating the number of layers of the feature pyramid according to the size of the image to be processed and the size of the preset face detection model, as shown in the following formula:
Figure PCTCN2017087702-appb-000011
其中,n表示所述特征金字塔的层数,kup是所述待处理图像上采样的倍数,wimg、himg分别表示所述待处理图像的宽度和高度,wm、hm分别表示所述预设人脸检测模型的宽度和高度,noctave指所述特征金字塔中每两倍尺寸之间的图像的层数。Where n denotes the number of layers of the feature pyramid, kup is the upsampling factor of the image to be processed, wimg and himg respectively denote the width and height of the image to be processed, wm and hm respectively denote the width and height of the preset face detection model, and noctave refers to the number of image layers between every doubling of scale in the feature pyramid.
可选地,上述处理器3000基于所述n层,构造所述特征金字塔,包括:Optionally, the processor 3000 constructing the feature pyramid based on the n layers includes:
确定所述n层包含P个实特征层和Q个近似特征层,所述P为大于或等于1的整数,所述Q为大于或等于0的整数;determining that the n layers include P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1 and Q is an integer greater than or equal to 0;
对所述P个实特征层进行特征提取,得到第三目标特征;Performing feature extraction on the P real feature layers to obtain a third target feature;
根据所述P个实特征层,确定所述Q个近似特征层的第四目标特征;Determining, according to the P real feature layers, a fourth target feature of the Q approximate feature layers;
将所述第三目标特征和所述第四目标特征构成所述特征金字塔。The third target feature and the fourth target feature constitute the feature pyramid.
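The split into P real and Q approximate feature layers above is commonly realized by computing channel features on a few real scales and extrapolating nearby scales with a power law. The sketch below follows that approach; the exponent value, the nearest-neighbour resampling, and the function name are assumptions drawn from the fast-feature-pyramid literature, not from this patent's text:

```python
import numpy as np

def approximate_layer(real_channels, real_scale, target_scale, lam=0.11):
    """Derive an approximate feature layer (a fourth target feature) from the
    nearest real layer via C_target ~ resize(C_real) * (s_t / s_r)**(-lam),
    instead of recomputing features on a resized image."""
    ratio = target_scale / real_scale
    h, w = real_channels.shape[:2]
    nh, nw = max(1, int(round(h * ratio))), max(1, int(round(w * ratio)))
    # nearest-neighbour resampling stands in for proper image rescaling
    ys = np.minimum((np.arange(nh) / ratio).astype(int), h - 1)
    xs = np.minimum((np.arange(nw) / ratio).astype(int), w - 1)
    return real_channels[np.ix_(ys, xs)] * ratio ** (-lam)
```

Computing only P real layers and filling the remaining Q layers this way avoids recomputing features at every scale, which is what makes the pyramid construction fast.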
可选地,上述处理器3000根据所述K组第一目标特征确定所述K组第二目标特征,包括:Optionally, the processor 3000 determines, according to the K group first target feature, the K group second target feature, including:
从所述K组第一目标特征中分别提取颜色特征,得到所述K组颜色特征; Extracting color features from the K group first target features to obtain the K group color features;
对第i组颜色特征计算像素比较特征,基于所述计算像素比较特征训练第一预设人脸模型,并从训练后的所述第一预设人脸模型提取第一目标像素比较特征,得到第五目标特征,其中,所述第i组颜色特征为所述K组颜色特征中的任一组颜色特征;Calculating pixel comparison features for the i-th group of color features, training the first preset face model based on the calculated pixel comparison features, and extracting the first target pixel comparison feature from the trained first preset face model to obtain the fifth target feature, where the i-th group of color features is any one of the K groups of color features;
通过所述第五目标特征和所述第一目标特征训练第二预设人脸模型,并从训练后的所述第二预设人脸模型提取第二像素比较特征,得到第六目标特征;Training a second preset face model by using the fifth target feature and the first target feature, and extracting a second pixel comparison feature from the trained second preset face model to obtain a sixth target feature;
将所述第一目标特征和所述第六目标特征组合为所述第二目标特征。Combining the first target feature and the sixth target feature into the second target feature.
可选地,上述处理器3000采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,包括:Optionally, the processor 3000 using the M specified decision trees to make a decision on the K groups of second target features to obtain the size and position of the target face frame includes:
在所述特征金字塔上,采用M个指定决策树对所述K组第二目标特征进行决策,得到X个人脸框,其中,所述X为大于或等于1的整数;On the feature pyramid, using the M specified decision trees to make a decision on the K groups of second target features to obtain X face frames, where X is an integer greater than or equal to 1;
根据所述X个人脸框合并为所述目标人脸框的大小和位置。Merging the X face frames to obtain the size and position of the target face frame.
本发明实施例还提供一种计算机存储介质,其中,该计算机存储介质可存储有程序,该程序执行时包括上述方法实施例中记载的任何一种图像处理方法的部分或全部步骤。The embodiment of the present invention further provides a computer storage medium, wherein the computer storage medium can store a program, and the program includes some or all of the steps of any one of the image processing methods described in the foregoing method embodiments.
尽管在此结合各实施例对本发明进行了描述,然而,在实施所要求保护的本发明过程中,本领域技术人员通过查看所述附图、公开内容、以及所附权利要求书,可理解并实现所述公开实施例的其他变化。在权利要求中,“包括”(comprising)一词不排除其他组成部分或步骤,“一”或“一个”不排除多个的情况。单个处理器或其他单元可以实现权利要求中列举的若干项功能。相互不同的从属权利要求中记载了某些措施,但这并不表示这些措施不能组合起来产生良好的效果。Although the present invention has been described herein in connection with various embodiments, those skilled in the art can, in practicing the claimed invention, understand and implement other variations of the disclosed embodiments by studying the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfill several of the functions recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that these measures cannot be combined to advantageous effect.
本领域技术人员应明白,本发明的实施例可提供为方法、装置(设备)、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。计算机程序存储/分布在合适的介质中,与其它硬件一起提供或作为硬件的一部分,也可以采用其他分布形式,如通过Internet或其它有线或无线电信系统。 Those skilled in the art will appreciate that embodiments of the present invention can be provided as a method, apparatus (device), or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code. The computer program is stored/distributed in a suitable medium, provided with other hardware or as part of the hardware, or in other distributed forms, such as over the Internet or other wired or wireless telecommunication systems.
本发明是参照本发明实施例的方法、装置(设备)和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present invention is described with reference to the flowcharts and/or block diagrams of the method, apparatus (device), and computer program product according to the embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。The computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising the instruction device. The apparatus implements the functions specified in one or more blocks of a flow or a flow and/or block diagram of the flowchart.
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions can also be loaded onto a computer or other programmable data processing device such that a series of operational steps are performed on a computer or other programmable device to produce computer-implemented processing for execution on a computer or other programmable device. The instructions provide steps for implementing the functions specified in one or more of the flow or in a block or blocks of a flow diagram.
尽管结合具体特征及其实施例对本发明进行了描述,显而易见的,在不脱离本发明的精神和范围的情况下,可对其进行各种修改和组合。相应地,本说明书和附图仅仅是所附权利要求所界定的本发明的示例性说明,且视为已覆盖本发明范围内的任意和所有修改、变化、组合或等同物。显然,本领域的技术人员可以对本发明进行各种改动和变型而不脱离本发明的精神和范围。这样,倘若本发明的这些修改和变型属于本发明权利要求及其等同技术的范围之内,则本发明也意图包含这些改动和变型在内。Although the present invention has been described with reference to specific features and embodiments thereof, it is apparent that various modifications and combinations can be made without departing from the spirit and scope of the invention. Accordingly, the specification and drawings are merely exemplary illustrations of the invention as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of the invention. It is apparent that those skilled in the art can make various modifications and variations to the invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to encompass these modifications and variations.

Claims (10)

  1. 一种图像处理方法,其特征在于,包括:An image processing method, comprising:
    获取待处理图像;Get the image to be processed;
    计算所述待处理图像的特征金字塔的层数,得到n层,所述n为大于或等于1的整数;Calculating a number of layers of the feature pyramid of the image to be processed, to obtain an n layer, wherein n is an integer greater than or equal to 1;
    基于所述n层,构造所述特征金字塔;Constructing the feature pyramid based on the n layers;
    在所述特征金字塔上,对K个预设检测窗口进行特征提取,得到所述K组第一目标特征,其中,每一组所述预设检测窗口对应一组第一目标特征,所述K为大于或等于1的整数;performing feature extraction on K preset detection windows on the feature pyramid to obtain the K groups of first target features, wherein each preset detection window corresponds to one group of first target features and K is an integer greater than or equal to 1;
    根据所述K组第一目标特征确定所述K组第二目标特征;Determining the K group second target feature according to the K group first target feature;
    采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,所述M为大于或等于1的整数。using M specified decision trees to make a decision on the K groups of second target features to obtain the size and position of the target face frame, wherein M is an integer greater than or equal to 1.
  2. 根据权利要求1所述的方法,其特征在于,所述计算所述待处理图像的特征金字塔的层数,得到n层,包括:The method according to claim 1, wherein the calculating the number of layers of the feature pyramid of the image to be processed to obtain the n layer comprises:
    根据所述待处理图像的尺寸和预设人脸检测模型的尺寸计算特征金字塔的层数,如下公式所示:Calculating the number of layers of the feature pyramid according to the size of the image to be processed and the size of the preset face detection model, as shown in the following formula:
    Figure PCTCN2017087702-appb-100001
    其中,n表示所述特征金字塔的层数,kup是所述待处理图像上采样的倍数,wimg、himg分别表示所述待处理图像的宽度和高度,wm、hm分别表示所述预设人脸检测模型的宽度和高度,noctave指所述特征金字塔中每两倍尺寸之间的图像的层数。wherein n denotes the number of layers of the feature pyramid, kup is the upsampling factor of the image to be processed, wimg and himg respectively denote the width and height of the image to be processed, wm and hm respectively denote the width and height of the preset face detection model, and noctave refers to the number of image layers between every doubling of scale in the feature pyramid.
  3. 根据权利要求1或2任一项所述的方法,其特征在于,所述基于所述N层,构造所述特征金字塔,包括:The method according to any one of claims 1 or 2, wherein the constructing the feature pyramid based on the N layer comprises:
    确定所述N层包含P个实特征层和Q个近似特征层,所述P为大于或等于1的整数,所述Q为大于或等于0的整数;Determining that the N layer comprises P real feature layers and Q approximate feature layers, wherein P is an integer greater than or equal to 1, and the Q is an integer greater than or equal to 0;
    对所述P个实特征层进行特征提取,得到第三目标特征; Performing feature extraction on the P real feature layers to obtain a third target feature;
    根据所述P个实特征层,确定所述Q个近似特征层的第四目标特征;Determining, according to the P real feature layers, a fourth target feature of the Q approximate feature layers;
    将所述第三目标特征和所述第四目标特征构成所述特征金字塔。The third target feature and the fourth target feature constitute the feature pyramid.
  4. 根据权利要求1或2任一项所述的方法,其特征在于,所述根据所述K组第一目标特征确定所述K组第二目标特征,包括:The method according to any one of claims 1 or 2, wherein the determining the K groups of second target features according to the K groups of first target features comprises:
    从所述K组第一目标特征中分别提取颜色特征,得到所述K组颜色特征;Extracting color features from the K group first target features to obtain the K group color features;
    对第i组颜色特征计算像素比较特征,基于所述计算像素比较特征训练第一预设人脸模型,并从训练后的所述第一预设人脸模型提取第一目标像素比较特征,得到第五目标特征,其中,所述第i组颜色特征为所述K组颜色特征中的任一组颜色特征;calculating pixel comparison features for the i-th group of color features, training the first preset face model based on the calculated pixel comparison features, and extracting the first target pixel comparison feature from the trained first preset face model to obtain the fifth target feature, wherein the i-th group of color features is any one of the K groups of color features;
    通过所述第五目标特征和所述第一目标特征训练第二预设人脸模型,并从训练后的所述第二预设人脸模型提取第二像素比较特征,得到第六目标特征;Training a second preset face model by using the fifth target feature and the first target feature, and extracting a second pixel comparison feature from the trained second preset face model to obtain a sixth target feature;
    将所述第一目标特征和所述第六目标特征组合为所述第二目标特征。Combining the first target feature and the sixth target feature into the second target feature.
  5. 根据权利要求1或2任一项所述的方法,其特征在于,所述采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,包括:The method according to any one of claims 1 or 2, wherein the using M specified decision trees to make a decision on the K groups of second target features to obtain the size and position of the target face frame comprises:
    在所述特征金字塔上,采用M个指定决策树对所述K组第二目标特征进行决策,得到X个人脸框,其中,所述X为大于或等于1的整数;on the feature pyramid, using the M specified decision trees to make a decision on the K groups of second target features to obtain X face frames, wherein X is an integer greater than or equal to 1;
    根据所述X个人脸框合并为所述目标人脸框的大小和位置。merging the X face frames to obtain the size and position of the target face frame.
  6. 一种终端,其特征在于,包括:A terminal, comprising:
    获取单元,用于获取待处理图像;An obtaining unit, configured to acquire an image to be processed;
    计算单元,用于计算所述待处理图像的特征金字塔的层数,得到n层,所述n为大于或等于1的整数;a calculating unit, configured to calculate a number of layers of the feature pyramid of the image to be processed, to obtain an n layer, wherein n is an integer greater than or equal to 1;
    构造单元,用于基于所述n层,构造所述特征金字塔;a constructing unit configured to construct the feature pyramid based on the n layers;
    提取单元,用于在所述特征金字塔上,对K个预设检测窗口进行特征提取,得到所述K组第一目标特征,其中,每一组所述预设检测窗口对应一组第一目标特征,所述K为大于或等于1的整数;an extracting unit, configured to perform feature extraction on K preset detection windows on the feature pyramid to obtain the K groups of first target features, wherein each preset detection window corresponds to one group of first target features and K is an integer greater than or equal to 1;
    确定单元,用于根据所述K组第一目标特征确定所述K组第二目标特征; a determining unit, configured to determine the K group second target feature according to the K group first target feature;
    决策单元,用于采用M个指定决策树对所述K组第二目标特征进行决策,得到目标人脸框的大小和位置,其中,所述M为大于或等于1的整数。a decision unit, configured to use M specified decision trees to make a decision on the K groups of second target features to obtain the size and position of the target face frame, wherein M is an integer greater than or equal to 1.
  7. 根据权利要求6所述的终端,其特征在于,所述计算单元具体用于:The terminal according to claim 6, wherein the calculating unit is specifically configured to:
    根据所述待处理图像的尺寸和预设人脸检测模型的尺寸计算特征金字塔的层数,如下公式所示:Calculating the number of layers of the feature pyramid according to the size of the image to be processed and the size of the preset face detection model, as shown in the following formula:
    Figure PCTCN2017087702-appb-100002
    其中,n表示所述特征金字塔的层数,kup是所述待处理图像上采样的倍数,wimg、himg分别表示所述待处理图像的宽度和高度,wm、hm分别表示所述预设人脸检测模型的宽度和高度,noctave指所述特征金字塔中每两倍尺寸之间的图像的层数。wherein n denotes the number of layers of the feature pyramid, kup is the upsampling factor of the image to be processed, wimg and himg respectively denote the width and height of the image to be processed, wm and hm respectively denote the width and height of the preset face detection model, and noctave refers to the number of image layers between every doubling of scale in the feature pyramid.
  8. 根据权利要求6或7任一项所述的终端,其特征在于,所述构造单元包括:The terminal according to any one of claims 6 or 7, wherein the construction unit comprises:
    第一确定模块,用于确定所述N层包含P个实特征层和Q个近似特征层,所述P为大于或等于1的整数,所述Q为大于或等于0的整数;a first determining module, configured to determine that the N layer includes P real feature layers and Q approximate feature layers, where P is an integer greater than or equal to 1, and the Q is an integer greater than or equal to 0;
    第一提取模块,用于对所述P个实特征层进行特征提取,得到第三目标特征;a first extraction module, configured to perform feature extraction on the P real feature layers to obtain a third target feature;
    第二确定模块,用于根据所述P个实特征层,确定所述Q个近似特征层的第四目标特征;a second determining module, configured to determine, according to the P real feature layers, a fourth target feature of the Q approximate feature layers;
    构造模块,用于将所述第三目标特征和所述第四目标特征构成所述特征金字塔。And a constructing module, configured to form the third target feature and the fourth target feature into the feature pyramid.
  9. 根据权利要求6或7任一项所述的终端,其特征在于,所述确定单元包括:The terminal according to any one of claims 6 or 7, wherein the determining unit comprises:
    第二提取模块,用于从所述K组第一目标特征中分别提取颜色特征,得到所述K组颜色特征;a second extraction module, configured to separately extract color features from the K group first target features to obtain the K group color features;
    第一训练模块,用于对第i组颜色特征计算像素比较特征,基于所述计算像素比较特征训练第一预设人脸模型,并从训练后的所述第一预设人脸模型提取第一目标像素比较特征,得到第五目标特征,其中,所述第i组颜色特征为所述K组颜色特征中的任一组颜色特征;a first training module, configured to calculate pixel comparison features for the i-th group of color features, train the first preset face model based on the calculated pixel comparison features, and extract the first target pixel comparison feature from the trained first preset face model to obtain the fifth target feature, wherein the i-th group of color features is any one of the K groups of color features;
    第二训练模块,用于通过所述第五目标特征和所述第一目标特征训练第二预设人脸模型,并从训练后的所述第二预设人脸模型提取第二像素比较特征,得到第六目标特征;a second training module, configured to train the second preset face model by using the fifth target feature and the first target feature, and extract the second pixel comparison feature from the trained second preset face model to obtain the sixth target feature;
    组合模块,用于将所述第一目标特征和所述第六目标特征组合为所述第二目标特征。And a combination module, configured to combine the first target feature and the sixth target feature into the second target feature.
  10. The terminal according to claim 6 or 7, wherein the decision unit comprises:
    a decision module, configured to apply M specified decision trees to the K groups of second target features on the feature pyramid to obtain X face frames, where X is an integer greater than or equal to 1;
    and a merging module, configured to merge the X face frames into the size and position of the target face frame.
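The decision trees typically fire on many overlapping windows, so the X candidate frames must be merged into one target frame. The patent does not specify the merging rule; a score-weighted average of the candidate boxes, shown below, is one simple realization (non-maximum suppression would be another).

```python
def merge_face_boxes(boxes, scores):
    """Merge X candidate face boxes into a single target box.

    boxes: list of (x, y, w, h) tuples; scores: per-box confidences.
    The merged box is the score-weighted average of the candidates,
    which yields both the position (x, y) and the size (w, h) of the
    target face frame.
    """
    total = float(sum(scores))
    x = sum(b[0] * s for b, s in zip(boxes, scores)) / total
    y = sum(b[1] * s for b, s in zip(boxes, scores)) / total
    w = sum(b[2] * s for b, s in zip(boxes, scores)) / total
    h = sum(b[3] * s for b, s in zip(boxes, scores)) / total
    return (x, y, w, h)
```

In practice candidates would first be grouped by overlap so that distinct faces are not averaged together; the function above handles one such group.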
PCT/CN2017/087702 2016-11-07 2017-06-09 Image processing method and terminal WO2018082308A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610982791.1 2016-11-07
CN201610982791.1A CN106650615B (en) 2016-11-07 2016-11-07 A kind of image processing method and terminal

Publications (1)

Publication Number Publication Date
WO2018082308A1 true WO2018082308A1 (en) 2018-05-11

Family

ID=58806382

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/087702 WO2018082308A1 (en) 2016-11-07 2017-06-09 Image processing method and terminal

Country Status (2)

Country Link
CN (1) CN106650615B (en)
WO (1) WO2018082308A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650615B (en) * 2016-11-07 2018-03-27 深圳云天励飞技术有限公司 A kind of image processing method and terminal
CN108229297B (en) * 2017-09-30 2020-06-05 深圳市商汤科技有限公司 Face recognition method and device, electronic equipment and computer storage medium
CN109727188A (en) * 2017-10-31 2019-05-07 比亚迪股份有限公司 Image processing method and its device, safe driving method and its device
CN108090417A (en) * 2017-11-27 2018-05-29 上海交通大学 A kind of method for detecting human face based on convolutional neural networks
CN109918969B (en) * 2017-12-12 2021-03-05 深圳云天励飞技术有限公司 Face detection method and device, computer device and computer readable storage medium
CN112424787A (en) * 2018-09-20 2021-02-26 华为技术有限公司 Method and device for extracting image key points
WO2020118554A1 (en) * 2018-12-12 2020-06-18 Paypal, Inc. Binning for nonlinear modeling
CN109902576B (en) * 2019-01-25 2021-05-18 华中科技大学 Training method and application of head and shoulder image classifier
CN109871829B (en) * 2019-03-15 2021-06-04 北京行易道科技有限公司 Detection model training method and device based on deep learning

Citations (5)

Publication number Priority date Publication date Assignee Title
US20080232698A1 (en) * 2007-03-21 2008-09-25 Ricoh Company, Ltd. Object image detection method and object image detection device
CN102831411A (en) * 2012-09-07 2012-12-19 云南晟邺科技有限公司 Quick face detection method
CN103049751A (en) * 2013-01-24 2013-04-17 苏州大学 Improved weighting region matching high-altitude video pedestrian recognizing method
CN103778430A (en) * 2014-02-24 2014-05-07 东南大学 Rapid face detection method based on combination between skin color segmentation and AdaBoost
CN106650615A (en) * 2016-11-07 2017-05-10 深圳云天励飞技术有限公司 Image processing method and terminal

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN101567048B (en) * 2008-04-21 2012-06-06 夏普株式会社 Image identifying device and image retrieving device
CN105512638B (en) * 2015-12-24 2018-07-31 王华锋 A kind of Face datection and alignment schemes based on fusion feature


Non-Patent Citations (1)

Title
ABRAMSON, Y. ET AL.: "Yet Even Faster (YEF) Real-Time Object Detection", INT. J. INTELLIGENT SYSTEMS TECHNOLOGIES AND APPLICATIONS, vol. 2, no. 2/3, 30 June 2007 (2007-06-30), XP055481201 *

Also Published As

Publication number Publication date
CN106650615A (en) 2017-05-10
CN106650615B (en) 2018-03-27

Similar Documents

Publication Publication Date Title
WO2018082308A1 (en) Image processing method and terminal
US10872262B2 (en) Information processing apparatus and information processing method for detecting position of object
JP5554984B2 (en) Pattern recognition method and pattern recognition apparatus
US11087169B2 (en) Image processing apparatus that identifies object and method therefor
WO2017190646A1 (en) Facial image processing method and apparatus and storage medium
WO2017088432A1 (en) Image recognition method and device
WO2019114036A1 (en) Face detection method and device, computer device, and computer readable storage medium
CN108446694B (en) Target detection method and device
CN109960742B (en) Local information searching method and device
WO2020199478A1 (en) Method for training image generation model, image generation method, device and apparatus, and storage medium
WO2018090937A1 (en) Image processing method, terminal and storage medium
JP6482195B2 (en) Image recognition apparatus, image recognition method, and program
US9626552B2 (en) Calculating facial image similarity
CN110765860A (en) Tumble determination method, tumble determination device, computer apparatus, and storage medium
WO2013086255A1 (en) Motion aligned distance calculations for image comparisons
CN109886223B (en) Face recognition method, bottom library input method and device and electronic equipment
WO2022174523A1 (en) Method for extracting gait feature of pedestrian, and gait recognition method and system
WO2023151237A1 (en) Face pose estimation method and apparatus, electronic device, and storage medium
US11741615B2 (en) Map segmentation method and device, motion estimation method, and device terminal
CN106407978B (en) Method for detecting salient object in unconstrained video by combining similarity degree
WO2015186347A1 (en) Detection system, detection method, and program storage medium
WO2021084972A1 (en) Object tracking device and object tracking method
WO2020001016A1 (en) Moving image generation method and apparatus, and electronic device and computer-readable storage medium
WO2021179822A1 (en) Human body feature point detection method and apparatus, electronic device, and storage medium
JP6202938B2 (en) Image recognition apparatus and image recognition method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17867013

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17867013

Country of ref document: EP

Kind code of ref document: A1