WO2018188453A1 - Method for determining a human face region, storage medium, and computer device - Google Patents

Method for determining a human face region, storage medium, and computer device

Info

Publication number
WO2018188453A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
region
neural network
convolutional neural
area
Application number
PCT/CN2018/079551
Other languages
English (en)
Chinese (zh)
Inventor
王亚彪
倪辉
赵艳丹
汪铖杰
李季檩
Original Assignee
腾讯科技(深圳)有限公司
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Co., Ltd.)
Publication of WO2018188453A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Definitions

  • The present application relates to the field of image processing, and in particular to a method for determining a face region, a storage medium, and a computer device.
  • Face recognition is a biometric recognition technology that identifies people based on facial feature information. A camera or video camera captures an image or video stream containing a face, the face is automatically detected and tracked in the image, and a series of related techniques is then applied to the detected face; this is usually called portrait recognition or face recognition.
  • Face detection has been widely studied as the basis for applications such as face recognition, face key point positioning, and face retrieval. Face detection determines whether a face exists in a given image; if it exists, the size and position of the face are given. As shown in Figure 1, detection is performed on the left image to obtain the right image, in which the face region (i.e., the dotted area) is identified.
  • For face detection, the main difficulty lies in the following two aspects: the face itself can vary in many details, such as changes in skin color, face shape, expression, and pose; and faces in images are also affected by a variety of external factors, such as lighting, camera shake, and occlusion by ornaments on the face.
  • Face detection methods are various and can be classified into feature-based detection methods and statistical-model-based detection methods.
  • The feature-based face detection method is mainly based on empirical rules and artificially constructed features, such as detection methods based on facial organ structures and texture features.
  • The statistical-model-based detection method also first extracts features from samples.
  • Face detection based on statistical models is not purely based on preset rules; instead, a large number of samples are used to train a detector model.
  • Typical examples include the face detection algorithm based on the Support Vector Machine (SVM) and the face detection algorithm based on Adaboost.
  • The commonly used indicators for evaluating face detection methods are mainly the following: (1) the detection rate, that is, for a given image set, the ratio of the number of correctly detected faces to the total number of faces in the images; and (2) the number of false detections, that is, the number of non-face areas detected as face areas. An ideal face detector should have a detection rate of 100% and zero false detections.
  • If the feature-based detection method in the related art is used, the reliance on empirical rules and artificially constructed features makes it easily affected by subjective factors, which harms the detection rate and robustness of face recognition. If the statistical-model-based detection method in the related art is used, the commonly used models often have many layers to ensure recognition accuracy, which leads to large models, all exceeding 15 MB. Although the additional layers ensure recognition accuracy, they slow down face detection (more than 300 ms on a mainstream PC), which cannot meet real-time requirements.
  • The embodiments of the present application provide a method for determining a face region, a storage medium, and a computer device, to at least solve the technical problem of poor real-time performance of face detection in the related art.
  • According to an aspect, a method for determining a face region includes: receiving a positioning request, where the positioning request is used to request locating a face region in a target picture; performing a face positioning operation on the target picture through a convolutional neural network to obtain a positioning result, where the convolutional neural network is used to call a graphics processor to perform a convolution operation on the target picture, and the face positioning operation includes the convolution operation; and returning the positioning result when the positioning result indicates that a face region is located in the target picture.
  • According to another aspect, a computer device (or a device for determining a face region) is provided, comprising a memory and a processor, the processor being configured to run a computer program saved in the memory to: receive a positioning request, where the positioning request is used to request locating a face region in a target picture; perform a face positioning operation on the target picture through a convolutional neural network to obtain a positioning result, where the convolutional neural network is used to call a graphics processor to perform a convolution operation on the target picture, and the face positioning operation includes the convolution operation; and return the positioning result when the positioning result indicates that a face region is located in the target picture.
  • According to yet another aspect, a storage medium is provided, comprising a stored program, where the program is configured to perform any of the methods described above when run.
  • In the embodiments of the present application, the convolution operation is performed by directly calling the graphics processor through the full convolutional network in the convolutional neural network. This hardware acceleration approach, rather than the software approach of region-by-region scanning by the CPU, can solve the technical problem of poor real-time performance of face detection in the related art, thereby achieving the technical effect of improving the real-time performance of face detection.
  • FIG. 1 is a schematic diagram of an optional face region in the related art
  • FIG. 2 is a schematic diagram of a hardware environment of a method for determining a face region according to an embodiment of the present application
  • FIG. 3 is a flowchart of an optional method for determining a face region according to an embodiment of the present application
  • FIG. 4 is a schematic diagram of an optional degree of face coincidence according to an embodiment of the present application.
  • FIG. 5 is a schematic illustration of an alternative sample in accordance with an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an optional network structure according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an optional face region according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an optional face region according to an embodiment of the present application.
  • FIG. 9 is a flowchart of an optional method for determining a face region according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of an alternative probability map in accordance with an embodiment of the present application.
  • FIG. 11 is a schematic diagram of an alternative computer device in accordance with an embodiment of the present application.
  • FIG. 12 is a structural block diagram of a terminal according to an embodiment of the present application.
  • The Convolutional Neural Network (CNN) is a feedforward neural network whose artificial neurons respond to a local coverage area; it performs excellently in large-scale image processing. It mainly includes convolutional layers and pooling layers.
  • Adaboost: an iterative algorithm that trains different classifiers on the same training set and then combines these classifiers to form a stronger classifier.
  • a method embodiment of a method for determining a face region is provided.
  • the method for determining the face region may be applied to a hardware environment formed by the server 202 and the terminal 204 as shown in FIG. 2 .
  • the server 202 is connected to the terminal 204 through a network.
  • the network includes but is not limited to a wide area network, a metropolitan area network, or a local area network.
  • the terminal 204 is not limited to a PC, a mobile phone, a tablet, or the like.
  • The face recognition function provided by the method of the present application may be directly integrated on the terminal, or a client implementing the method of the present application may be installed on it. When the terminal receives a positioning request, a face positioning operation is performed on the target picture through the convolutional neural network to obtain the positioning result, where the convolutional neural network is used to call the graphics processor to convolve the target picture and the face positioning operation includes the convolution operation; when the positioning result indicates that a face region exists in the target picture, the positioning result is returned.
  • The method provided by the present application may also run on a server or the like in the form of a Software Development Kit (SDK): the face recognition function is provided to applications as an SDK interface, and other devices can achieve face region recognition through the provided interface.
  • When receiving a positioning request sent by another device through the interface, the server performs the face positioning operation on the target picture through the convolutional neural network to obtain a positioning result, where the convolutional neural network is used to call the graphics processor to perform the convolution operation on the target picture and the face positioning operation includes the convolution operation; when the positioning result indicates that a face region exists in the target picture, the positioning result is returned to the device that initiated the request.
  • FIG. 3 is a flowchart of an optional method for determining a face region according to an embodiment of the present application. As shown in FIG. 3, the method may include the following steps:
  • Step S302: the server receives a positioning request, where the positioning request is used to request locating a face region in a target picture.
  • Step S304: the server performs a face positioning operation on the target picture through the deployed convolutional neural network to obtain a positioning result, where the convolutional neural network is used to call a graphics processor to perform a convolution operation on the target picture, and the face positioning operation includes the convolution operation.
  • Step S306: when the positioning result indicates that a face region is located in the target picture, the server returns the positioning result to the object that initiated the positioning request or to the receiver indicated by the positioning request.
  • The computer device of the present application may be the above-mentioned server or terminal.
  • The above embodiment is schematically illustrated with the server 202 performing the method for determining the face region.
  • The method for determining the face region of the present application may also be performed by the terminal 204; that is, the execution body of the above steps may be the terminal instead of the server, or the steps may be performed jointly by the server 202 and the terminal 204.
  • For example, the terminal initiates a positioning request, and the server completes the positioning and returns the positioning result to the terminal.
  • The method for determining the face region performed by the terminal 204 in the embodiments of the present application may also be performed by a client installed on it. For consistency of description, the following takes the server as the execution body for detailed description.
  • Through the above steps, when a positioning request is received, the face positioning operation is performed on the target picture through the convolutional neural network to obtain a positioning result, and the positioning result indicates whether a face region exists in the target picture. In the preliminary recognition, the convolution operation is performed by directly calling the graphics processor through the full convolutional network in the convolutional neural network; this hardware acceleration approach, rather than the software approach of region-by-region scanning by the CPU, can solve the technical problem of poor real-time performance of face detection in the related art, thereby improving the real-time performance of face detection.
  • The face detection algorithms in the related art have many problems in general application scenarios.
  • Feature-based face detection is fast, but for slightly complex scenes its detection rate is low and it lacks robustness;
  • although the Adaboost face detection algorithm has a small model and a fast detection speed, it is less robust in complex scenes, such as faces in extreme conditions (e.g., masks or black-rimmed glasses) and blurred images.
  • In the convolutional neural network of the present application, there are mainly three convolutional neural networks: the first-level convolutional neural network net-1, the second-level convolutional neural network net-2, and the third-level convolutional neural network net-3.
  • These three networks can adopt a cascade structure: given an image, net-1 outputs a candidate face frame set; the candidate set is input to net-2 to obtain a more accurate candidate face frame set; and the resulting candidate set is then input to net-3 to obtain the final face frame set, i.e., the final face positions. This is a coarse-to-fine positioning process, as the sketch below illustrates.
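To make the data flow concrete, the following is a minimal Python sketch of the coarse-to-fine cascade just described. The function and parameter names (detect_faces, net_1, net_2, net_3, and their call signatures) are illustrative assumptions, not an API defined by the patent.

```python
def detect_faces(image, net_1, net_2, net_3):
    """Three-level cascade: each stage filters and refines the
    candidate face frames produced by the previous stage."""
    candidates = net_1(image)              # coarse candidate face frame set 1
    candidates = net_2(image, candidates)  # more accurate candidate set
    return net_3(image, candidates)        # final face frame set (face positions)
```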
  • A convolutional neural network (CNN) is used to express facial features. Compared with the Adaboost- or SVM-based face detection methods in the related art, it is more robust to scenes such as profile faces, dim light, and blur; meanwhile, the convolutional neural network with a three-level cascade structure can ensure the accuracy of recognition;
  • the initial positioning and precise positioning of the face frame (i.e., the face region) are completed by a classification branch and a regression branch respectively, and the two branches share the intermediate layers; compared with the models used by some face detection methods in the related art (such as deep-learning-based models), this reduces the model size and makes detection faster;
  • the first-level network in the three-level cascade structure of the present application adopts a full convolutional neural network instead of the sliding window in the related art, and the full convolutional neural network directly calls the GPU for processing, so that the process of generating candidate face frames is greatly accelerated.
  • Before the positioning request is received, the parameters in the convolutional neural network may be learned by training the convolutional neural network deployed on the server with the pictures in a picture set, to determine the values of the parameters, where the pictures in the picture set are images that include part or all of a face region.
  • The above learning process mainly includes two parts: selecting appropriate training data, and training to obtain the parameter values.
  • The training data for the above convolutional neural network can be divided into three categories: positive samples, regression samples, and negative samples. These three types are distinguished by the IoU (Intersection over Union) between the face region identified in the sample (i.e., the face frame) and the real face region. IoU describes the degree of overlap between two frames: the ratio of the common area of the sample face region frame A and the real face region frame B (i.e., the overlapping part) to the total area occupied by A and B, namely:

  $\mathrm{IoU} = \frac{|A \cap B|}{|A \cup B|}$

  where $A \cap B$ is the common area of the sample face region frame A and the real face region frame B, and $A \cup B$ is the total area occupied by frames A and B.
  • In FIG. 5, the dotted frame is the ground truth (the real face region) and the solid frame is a generated sample frame (i.e., a sample face region). When training, the sample data used for training can be obtained as in FIG. 5, e.g., by inputting the sample face regions into the convolutional neural network.
  • Optionally, the three types of samples can be defined as follows: positive samples are samples with IoU greater than 0.7; regression samples are samples with IoU between 0.5 and 0.7; and negative samples are samples with IoU less than 0.25.
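As a concrete illustration of the IoU formula and the three sample categories above, here is a minimal Python sketch. The (x, y, w, h) box convention (upper-left vertex plus width and height) follows the position format described later in this document; the "ignored" label for samples with IoU between 0.25 and 0.5, which the text does not assign to any category, is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))  # overlap width
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))  # overlap height
    inter = ix * iy                                     # area of A intersect B
    union = aw * ah + bw * bh - inter                   # area of A union B
    return inter / union if union > 0 else 0.0

def label_sample(sample_box, gt_box):
    """Assign a sample category using the thresholds defined above."""
    v = iou(sample_box, gt_box)
    if v > 0.7:
        return "positive"
    if 0.5 <= v <= 0.7:
        return "regression"
    if v < 0.25:
        return "negative"
    return "ignored"  # 0.25 <= IoU < 0.5: not covered by the text (assumption)
```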
  • After the data is prepared, the neural network can be trained with it.
  • The network structure used in the present application may be a dual-branch structure.
  • One branch is face classification, which determines whether the current input contains a face, yielding a candidate face frame set; the other branch is face box regression, which, after the classification branch gives the initial face frame coordinates, adjusts the coordinates of the face region to obtain an accurate face frame position.
  • The goal of the face classification branch of each of the three nets in FIG. 6 is to minimize the classification error (the softmax loss). The softmax expression of the final classification neuron is:

  $h_\theta(x_i)_j = \frac{e^{\theta_j^T x_i}}{\sum_{l=1}^{k} e^{\theta_l^T x_i}}, \quad j = 1, \dots, k$

  where $h$ is the output, $\theta$ is the model parameter, $k$ is the number of states (classes) to be estimated, $M$ is the number of samples used in one forward pass (over which the loss is averaged), $x_i$ is the $i$-th input, and the denominator normalizes the probability distribution so that the probabilities sum to 1.
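The following sketch shows the reconstructed softmax in code, assuming the standard formulation that the variable glossary above implies (theta holding one parameter vector per class); it is an illustration, not the patent's implementation.

```python
import numpy as np

def softmax_probs(theta, x):
    """Class probabilities for input x; theta has one row per class.
    Subtracting the max logit is a numerical-stability trick only."""
    logits = theta @ x                 # theta_j^T x for each class j
    logits = logits - logits.max()
    e = np.exp(logits)
    return e / e.sum()                 # normalized: probabilities sum to 1

def softmax_loss(theta, xs, ys):
    """Average cross-entropy (softmax loss) over M = len(xs) samples."""
    M = len(xs)
    return -sum(np.log(softmax_probs(theta, x)[y]) for x, y in zip(xs, ys)) / M
```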
  • For the face classification branch, each candidate frame obtained may include information of four dimensions, namely (x_i, y_i, w_i, h_i), as shown in FIG. 7. In an image, the position of the face frame is shown as the solid frame in FIG. 7, and the dotted frame is an example of a selected sample.
  • For the face box regression branch, the Euclidean loss function to be optimized is:

  $E = \frac{1}{2M} \sum_{i=1}^{M} \left\| \hat{z}_i - z_i \right\|_2^2$

  where $z_i$ is the regression target for the $i$-th sample and $\hat{z}_i$ is the network's prediction. The above dimension information uses relative quantities; taking the first component of $z_i$ as an example:

  $z_i^{(1)} = \frac{x_i - x_i''}{w_i''}$

  where $x''$ is the vertex x-coordinate of the selected sample frame and $w''$ is its width.
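A small sketch of the regression targets and loss described above. Only the first component of z_i is spelled out in the text; the remaining three components below are assumed to be analogous relative quantities.

```python
import numpy as np

def encode_regression_target(gt_box, sample_box):
    """Relative regression target z for a sample frame (x'', y'', w'', h'')
    against the ground-truth frame (x, y, w, h)."""
    x, y, w, h = gt_box
    xs, ys, ws, hs = sample_box
    return np.array([(x - xs) / ws,   # matches the first component in the text
                     (y - ys) / hs,   # assumed analogous for y
                     (w - ws) / ws,   # assumed analogous relative size offsets;
                     (h - hs) / hs])  # the text only defines the first component

def euclidean_loss(z_pred, z_true):
    """Euclidean loss averaged over M samples: (1/2M) * sum ||z_i - z^_i||^2."""
    z_pred, z_true = np.asarray(z_pred), np.asarray(z_true)
    M = len(z_pred)
    return np.sum((z_pred - z_true) ** 2) / (2 * M)
```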
  • When training, the parameters in the convolutional neural network can first be initialized, and the sample pictures are then input into the convolutional neural network to obtain its output (i.e., the face localization result). The error between the output and the actual result is calculated with the two formulas above. If the error is within the allowable range, the current parameters are reasonable; if not, the parameters are adjusted according to the magnitude of the error, the sample pictures are input again, and the error is recalculated with the two formulas, until the error between the network's result and the actual result is within the allowable range.
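The training loop described above can be sketched as follows, reusing softmax_loss from the earlier sketch. Plain gradient descent with a finite-difference gradient stands in for the unspecified "adjust the parameters according to the error magnitude" rule; the learning rate and tolerance are placeholder values.

```python
import numpy as np

def numerical_gradient(f, theta, eps=1e-5):
    """Finite-difference gradient, kept simple for the sketch."""
    g = np.zeros_like(theta)
    it = np.nditer(theta, flags=["multi_index"])
    while not it.finished:
        idx = it.multi_index
        old = theta[idx]
        theta[idx] = old + eps; f_plus = f(theta)
        theta[idx] = old - eps; f_minus = f(theta)
        theta[idx] = old
        g[idx] = (f_plus - f_minus) / (2 * eps)
        it.iternext()
    return g

def train(theta, xs, ys, lr=0.1, tol=1e-3, max_iter=1000):
    """Forward pass, compute the error, stop if within the allowable
    range, otherwise adjust the parameters and repeat."""
    for _ in range(max_iter):
        loss = softmax_loss(theta, xs, ys)   # forward pass + error
        if loss < tol:                       # error within allowable range
            break
        theta = theta - lr * numerical_gradient(
            lambda t: softmax_loss(t, xs, ys), theta)
    return theta
```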
  • After the above training, the face region can be identified by the method provided by the present application. For example, the positioning request mainly includes but is not limited to the following sources:
  • a face positioning request initiated to the terminal by a client B on the terminal may be received, where client B may be a client that needs to detect faces in real time, such as a live-streaming beauty filter or face tracking;
  • when terminal B communicates with terminal A (for example, via WiFi, Bluetooth, or NFC), terminal A may receive a face positioning request initiated by terminal B.
  • The three-level convolutional neural network of the present application works in a cascade manner: the face frame set 1 output by the first-level convolutional neural network can be used as the input of the second-level convolutional neural network net-2 (i.e., the second convolutional neural network) for further filtering and screening; the output of the second-level convolutional neural network's filtering and screening can be used as the input of the third-level convolutional neural network (the third convolutional neural network); and the output of the third-level convolutional neural network's filtering and screening is the final result.
  • In the above steps, the first convolutional neural network net-1 in the three-level convolutional neural network may be used to call the graphics processor to convolve the target picture and obtain a convolution result, where the convolutional neural network includes the first convolutional neural network; the confidence that each first region in the target picture is a face region is determined according to the convolution result; and the face region is determined among the first regions according to the confidence.
  • When the server calls the graphics processor through the first convolutional neural network to convolve the target picture and obtain the convolution result, the convolution algorithm of the first convolutional neural network can be executed by the graphics processor to perform one class of feature recognition on each first region in the target picture, obtaining a convolution result that indicates the features of that class for each first region.
  • When the confidence that a first region is a face region is determined according to the convolution result, it may be determined according to the features of that class identified in the first region.
  • As shown in FIG. 6, the input picture has parameters 12*12*3: "12*12" indicates that the input picture has a pixel size of at least 12*12 (i.e., the third threshold), that is, the minimum face area supported for recognition is 12*12, and "3" indicates a 3-channel image. The first-level convolutional neural network is used to identify relatively coarse-grained face features (i.e., the above class of features) for each region in the picture; a preset feature matching algorithm is then used to determine the confidence that each region is a face region. The first regions with confidence greater than the first threshold are placed into candidate face frame set 1 (the regions in this set are recorded as second regions).
  • Optionally, the first region may be position-adjusted according to the position of a face reference feature in the first region, so that the face reference feature is located at a preset position in the adjusted first region. In this case, when the face region is determined in the first regions according to the confidence, it can be determined in the position-adjusted first regions.
  • Alternatively, to avoid wasting resources, only the second regions among the first regions (i.e., the first regions whose confidence is greater than the first threshold) may be position-adjusted; the face region is then determined in the position-adjusted second regions according to the confidence.
  • The above-mentioned face reference feature may be a facial feature (such as the nose, eyes, mouth, or eyebrows), whose position on a face is relatively fixed. For example, the nose is generally located in a central position of the face; that is, after identifying the nose in a first region, the first region can be adjusted so that the nose is centered in the adjusted region.
  • The convolutional neural network of the present application may be a three-level convolutional neural network: the first-level convolutional neural network mainly performs preliminary identification of face regions, giving the candidate face frame set 1 described above. Face frame set 1 can then be used as the input of the second convolutional neural network, which determines the confidence that the second regions in face frame set 1 are face regions, where the second regions are the first regions whose confidence is greater than the first threshold.
  • Optionally, before the second convolutional neural network determines the confidence that the second regions are face regions, the area size of each second region may be adjusted to a fourth threshold greater than the third threshold; for example, the pixel size is adjusted to a "24*24" 3-channel image. Feature recognition is then performed on the resized second regions through the second convolutional neural network; the feature types identified here differ from those identified by the first-level convolutional neural network. After the recognition is completed, the confidence that each second region is a face region may be determined according to the identified features, for example by using a preset feature matching algorithm.
  • After the confidence is obtained, the regions among the second regions whose confidence is greater than the second threshold may be placed into face frame set 2 (the regions in this set are recorded as third regions). The face region among the third regions can then be identified by the third convolutional neural network.
  • Optionally, the third regions in face frame set 2 may be position-adjusted following the position adjustment method described above for the second regions.
  • Optionally, the area size of each third region may be adjusted to a fifth threshold greater than the fourth threshold; for example, the third regions are resized to "48*48" images and used as the input of the third convolutional neural network. Feature recognition is performed on the resized third regions through the third convolutional neural network; the feature types identified here differ from those identified by the first-level and second-level convolutional neural networks. After the recognition is completed, the face region can be determined among the third regions according to the identified features, for example by calculating the matching degree with a preset feature matching algorithm and taking the third region with the highest matching degree as the face region.
  • The features identified by the first-level convolutional neural network are relatively simple, and its discrimination threshold can be set loosely, so that a large number of non-face windows are eliminated while a high recall rate is maintained; the second-level and third-level convolutional neural networks can be designed to be more complex, but since they only need to process the remaining windows, sufficient efficiency can still be ensured.
  • Cascading helps combine weaker classifiers into a stronger one while providing a certain efficiency guarantee. Because the input image pixels at each level differ in size, the network can learn multi-scale feature combinations, which facilitates the final recognition of the face.
  • The depth models in the related art are relatively large (with many convolutional layers); models in the related art exceed 15 MB, so face detection is relatively slow (more than 300 ms on a mainstream PC) and cannot satisfy real-time requirements. The cascaded deep network architecture used in this application has a high detection rate, low false detection, fast speed (less than 40 ms on a mainstream PC), and a small model, which compensates for the shortcomings of the existing face detection methods.
  • Optionally, the server returning the positioning result includes returning position information of the face regions located by the convolutional neural network, where the position information indicates the positions of the face regions in the target picture, e.g., as {(x_i, y_i, w_i, h_i)}, i = 1, ..., N, where N is the number of faces, (x_i, y_i) represents the image coordinates of the upper-left vertex of the i-th face frame, and w_i and h_i represent the width and height of the face frame respectively.
  • After the face regions are located (e.g., the face region shown in the right image), the position information is returned to the object that initiated the request.
  • The above position information is any information that can uniquely identify a face region in the image, and (x_i, y_i, w_i, h_i) is one representative form. The form can be adjusted as needed, such as returning the coordinates of one of the other corners (e.g., the lower-left, upper-right, or lower-right corner) together with the width and height of the face frame; returning the coordinates of the center point of the region together with the width and height; or returning the coordinates of any two of the four corners (e.g., two diagonal corners).
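The equivalence of these position formats can be illustrated with a few conversion helpers; the function names are illustrative.

```python
def xywh_to_corners(box):
    """(x, y, w, h) with upper-left (x, y) -> upper-left and lower-right corners."""
    x, y, w, h = box
    return (x, y), (x + w, y + h)

def corners_to_xywh(p1, p2):
    """Two diagonal corners -> (x, y, w, h)."""
    (x1, y1), (x2, y2) = p1, p2
    return (min(x1, x2), min(y1, y2), abs(x2 - x1), abs(y2 - y1))

def center_to_xywh(cx, cy, w, h):
    """Center point plus width/height -> (x, y, w, h)."""
    return (cx - w / 2.0, cy - h / 2.0, w, h)
```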
  • After the position information is returned, applications such as face key point positioning, living body detection, and face recognition and retrieval can be completed. For example, features such as the eyes, nose, mouth, and eyebrows in the face region can be located according to the relevant algorithms.
  • Step S902: learn the values of the parameters in the convolutional neural network.
  • After the learning is completed, the convolutional neural network can be applied to products in scenarios such as the above-mentioned public-area intelligent monitoring, hospital patient identification, and automatic identification at stations or airports.
  • The recognition of face regions is completed through the following steps:
  • Step S904: input an image P into the first-level convolutional neural network net-1 of the server. The face classification branch of net-1 outputs a probability map Prob (as shown in FIG. 10), in which each point corresponds to the likelihood (i.e., confidence) that a face appears at the corresponding location of the image P. A threshold cls-1 is set and the positions in Prob with values greater than cls-1 are retained; assuming there are m face frames in total, this gives the face frame set R1 (that is, candidate face frame set 1).
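A sketch of how the probability map can be turned into the face frame set R1. The mapping from a probability-map cell back to a 12*12 window in P, and the stride value, are assumptions consistent with the 12*12 minimum face size described earlier; the patent does not specify them.

```python
import numpy as np

def candidate_boxes(prob_map, cls_1, stride=2, cell=12):
    """Keep every location of the net-1 probability map whose confidence
    exceeds cls-1 and map it back to a window of the input image P."""
    ys, xs = np.where(prob_map > cls_1)
    boxes = [(x * stride, y * stride, cell, cell) for y, x in zip(ys, xs)]
    confs = prob_map[ys, xs]
    return boxes, confs   # face frame set R1 with per-frame confidences
```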
  • Step S906: adjust the position of each face frame in R1 through the face box regression branch of net-1 to obtain a more accurate face frame set.
  • Step S908: perform non-maximum suppression (NMS) on the face frames; that is, when the IoU of two sample frames is greater than the threshold nms-1, the frame with the lower confidence is deleted, giving the candidate set after this processing.
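Non-maximum suppression as described in step S908 can be sketched as follows, reusing the iou() helper from the earlier sketch.

```python
import numpy as np

def nms(boxes, confs, nms_thresh):
    """Whenever two frames overlap with IoU above the threshold,
    delete the frame with the lower confidence."""
    order = np.argsort(confs)[::-1]   # highest confidence first
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return [boxes[i] for i in keep], [confs[i] for i in keep]
```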
  • Step S910: scale each sub-image of the original image P corresponding to a retained frame (e.g., to an image with length 24 and width 24) and input them in turn into the second-level convolutional neural network net-2. A threshold cls-2 is set; through the face classification branch of net-2, the confidence of each candidate frame is obtained, and the face frames with confidence greater than cls-2 are retained, giving a new face frame set R2 (i.e., candidate face frame set 2).
  • Step S912: adjust the position of each face frame in R2 through the face box regression branch of net-2 to obtain a more accurate face frame set.
  • Step S914: perform non-maximum suppression on the face frames with threshold nms-2 to obtain the candidate frame set.
  • Step S916: scale the corresponding sub-images to images with length and width 48 and input them into the third-level convolutional neural network net-3. A threshold cls-3 is set; through the face classification branch of net-3, the confidence of each candidate frame is obtained, and the face frames with confidence greater than cls-3 are retained, giving a new face frame set R3.
  • Step S918: adjust the position of each face frame in R3 through the face box regression branch of net-3 to obtain a more accurate face frame set.
  • Step S920: perform non-maximum suppression on the face frames with threshold nms-3 to obtain the final face frame set, that is, the position of each face in the image P, such as the face regions shown in the figure. After the identification of face regions is completed, matching software in the above-mentioned scenarios (public-area intelligent monitoring, hospital patient identification, automatic identification at stations or airports, etc.) matches each identified face region against the faces of recorded identities in a database to identify the identity of the person to whom each face region belongs.
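Steps S904 through S920 can be tied together in one structural sketch, reusing candidate_boxes and nms from the sketches above. The net_k objects with prob_map/classify/regress methods, the crop_and_resize helper, and the threshold containers are all hypothetical placeholders for the trained networks and preprocessing, not the patent's API.

```python
def cascade_detect(P, net_1, net_2, net_3, cls, nms_thr):
    """End-to-end sketch of steps S904-S920 for an input image P."""
    # Stage 1 (S904-S908): probability map -> R1 -> box regression -> NMS
    boxes, confs = candidate_boxes(net_1.prob_map(P), cls[0])
    boxes = net_1.regress(P, boxes)
    boxes, confs = nms(boxes, confs, nms_thr[0])

    # Stage 2 (S910-S914): rescale crops to 24x24, keep confs > cls-2, refine, NMS
    crops = [crop_and_resize(P, b, 24) for b in boxes]   # hypothetical helper
    boxes, confs = keep_confident(boxes, net_2.classify(crops), cls[1])
    boxes = net_2.regress(crops, boxes)
    boxes, confs = nms(boxes, confs, nms_thr[1])

    # Stage 3 (S916-S920): rescale crops to 48x48, keep confs > cls-3, refine, NMS
    crops = [crop_and_resize(P, b, 48) for b in boxes]
    boxes, confs = keep_confident(boxes, net_3.classify(crops), cls[2])
    boxes = net_3.regress(crops, boxes)
    boxes, confs = nms(boxes, confs, nms_thr[2])
    return boxes   # each face position in image P

def keep_confident(boxes, confs, thresh):
    """Keep only the frames whose classification confidence exceeds thresh."""
    kept = [(b, c) for b, c in zip(boxes, confs) if c > thresh]
    return [b for b, _ in kept], [c for _, c in kept]
```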
  • The technical solution provided by the embodiments of the present application can provide services for various scenarios in the form of an SDK. Its high detection rate and low false detection rate make real-time, deep-learning-based face detection on mobile terminals possible.
  • Through the description of the above embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, or by hardware, although in many cases the former is the better implementation. The technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc), including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the methods described in the various embodiments of the present application.
  • According to an embodiment of the present application, a computer device for implementing the above method for determining a face region is also provided.
  • FIG. 11 is a schematic diagram of an optional computer device (or a device for determining a face region) according to an embodiment of the present application, including a memory and a processor, the processor being configured to run a computer program saved in the memory to: receive a positioning request, where the positioning request is used to request locating a face region in a target picture; perform a face positioning operation on the target picture through a convolutional neural network to obtain a positioning result, where the convolutional neural network is used to call a graphics processor to perform a convolution operation on the target picture and the face positioning operation includes the convolution operation; and return the positioning result when the positioning result indicates that a face region exists in the target picture.
  • the above computer program may include the following software modules: a receiving unit 112, a positioning unit 114, and a return unit 116.
  • the receiving unit 112 is configured to receive a positioning request, where the positioning request is used to request to locate a face region in the target image;
  • The positioning unit 114 is configured to perform a face positioning operation on the target picture through a convolutional neural network to obtain a positioning result, where the convolutional neural network is used to call a graphics processor to perform a convolution operation on the target picture, and the face positioning operation includes the convolution operation.
  • The returning unit 116 is configured to return the positioning result when the positioning result indicates that a face region exists in the target picture.
  • the receiving unit 112 in this embodiment may be used to perform step S302 in the embodiment of the present application.
  • the positioning unit 114 in this embodiment may be used to perform step S304 in the embodiment of the present application.
  • the returning unit 116 can be used to perform step S306 in the embodiment of the present application.
  • It should be noted that the foregoing units have the same implementation examples and application scenarios as the corresponding steps, but are not limited to the contents disclosed in the foregoing embodiments. The foregoing units may run, as part of a computer device, in a hardware environment as shown in FIG. 2, and may be implemented by software or by hardware.
  • Through the above units, when a positioning request is received, the face positioning operation is performed on the target picture through the convolutional neural network to obtain a positioning result, and the positioning result is returned when it indicates that a face region exists in the target picture. In the preliminary recognition, the convolution operation is performed by directly calling the graphics processor through the full convolutional network in the convolutional neural network; this hardware acceleration approach, rather than the software approach of region-by-region scanning by the CPU, can solve the technical problem of poor real-time performance of face detection in the related art, thereby improving the real-time performance of face detection.
  • The face detection algorithms in the related art have many problems in general application scenarios.
  • Feature-based face detection is fast, but for slightly complex scenes its detection rate is low and it lacks robustness;
  • although the Adaboost face detection algorithm has a small model and a fast detection speed, it is less robust in complex scenes, such as faces in extreme conditions (e.g., masks or black-rimmed glasses) and blurred images.
  • In the present application, a candidate face frame set is output through net-1, the candidate set is input to net-2 to obtain a more accurate candidate face frame set, and the resulting candidate set is then input to net-3 to obtain the final face frame set, i.e., the final face positions. This is a coarse-to-fine process.
  • Under the premise of ensuring robustness, detection rate, and accuracy, the problem of poor real-time performance in the related technologies can be solved, which is mainly embodied as follows:
  • A convolutional neural network (CNN) is used to express facial features. Compared with the Adaboost- or SVM-based face detection methods in the related art, it is more robust to scenes such as profile faces, dim light, and blur; meanwhile, the convolutional neural network with a three-level cascade structure can ensure the accuracy of recognition;
  • the initial positioning and precise positioning of the face frame (i.e., the face region) are completed by a classification branch and a regression branch respectively, and the two branches share the intermediate layers; compared with the models used by some face detection methods in the related art (such as deep-learning-based models), this reduces the model size and makes detection faster;
  • the first-level network in the three-level cascade structure of the present application adopts a full convolutional neural network instead of the traditional sliding window, and the full convolutional neural network directly calls the graphics processing unit (GPU) for processing, so that the process of generating candidate face frames is greatly accelerated.
  • Optionally, the parameters in the convolutional neural network may be learned as follows: the processor runs a computer program saved in the memory to train the convolutional neural network with the pictures in a picture set before the positioning request is received, to determine the values of the parameters in the convolutional neural network, where the pictures in the picture set are images that include part or all of a face region.
  • The above learning process mainly includes two parts: selecting appropriate training data, and training to obtain the parameter values.
  • After the above training, the face region can be identified by the solution provided by the present application. For example, positioning requests mainly include but are not limited to the following sources:
  • the solution of the present application can be integrated on the terminal, or installed on the terminal in the form of a client A, which can then receive a face positioning request initiated to the terminal by a client B on the terminal, where client B may be a client that needs to detect faces in real time, such as a live-streaming beauty filter or face tracking;
  • when terminal B communicates with terminal A (for example, via WiFi, Bluetooth, or NFC), terminal A may receive a face positioning request initiated by terminal B.
  • The three-level convolutional neural network of the present application works in a cascade manner: the face frame set 1 output by the first-level convolutional neural network can be used as the input of the second-level convolutional neural network net-2 (i.e., the second convolutional neural network) for further filtering and screening; the output of the second-level convolutional neural network's filtering and screening can be used as the input of the third-level convolutional neural network (the third convolutional neural network); and the output of the third-level convolutional neural network's filtering and screening is the final result.
  • the processor is further configured to run a computer program saved in the memory: the convolution operation is performed on the target image by calling the graphics processor through the first convolutional neural network to obtain a convolution result, wherein the convolutional neural network includes a first convolutional neural network; determining, according to the convolution result, a confidence that the first region in the target picture is a face region; and determining a face region in the first region according to the confidence.
  • Optionally, the processor is further configured to run a computer program saved in the memory to: position-adjust the first region according to the position of the face reference feature in the first region, so that the face reference feature is located at a preset position in the adjusted first region; determining the face region in the first regions according to the confidence then includes determining the face region in the position-adjusted first regions according to the confidence.
  • Optionally, the processor is further configured to run a computer program saved in the memory to: execute the convolution algorithm of the first convolutional neural network by calling the graphics processor to perform one class of feature recognition on each first region in the target picture, obtaining a convolution result that indicates the features of that class for each first region; and determine the confidence that a first region is a face region according to the features of that class identified in it.
  • Optionally, the convolutional neural network further comprises a second convolutional neural network and a third convolutional neural network, and the processor is further configured to run a computer program saved in the memory to: determine, by the second convolutional neural network, the confidence that the second regions are face regions, where the second regions are the first regions with confidence greater than the first threshold; and identify, by the third convolutional neural network, the face region among the third regions, where the third regions are the second regions with confidence greater than the second threshold.
  • The area size of each second region is adjusted to a fourth threshold greater than the third threshold; for example, the pixel size is adjusted to a "24*24" 3-channel image. Feature recognition is then performed on the resized second regions through the second convolutional neural network; the feature types identified here differ from those identified by the first-level convolutional neural network. After the recognition is completed, the confidence that each second region is a face region may be determined according to the identified features, for example calculated by a preset feature matching algorithm.
  • Optionally, the area size of the first regions is not less than the third threshold, and the processor is further configured to run a computer program saved in the memory to: before the second convolutional neural network determines the confidence that the second regions are face regions, adjust the area size of the second regions to a fourth threshold greater than the third threshold; perform feature recognition on the resized second regions through the second convolutional neural network and determine the confidence that they are face regions according to the identified features; adjust the area size of the third regions to a fifth threshold greater than the fourth threshold; and perform feature recognition on the resized third regions through the third convolutional neural network, determining the face region among the third regions according to the identified features.
  • Optionally, the area size of each third region may be adjusted to a fifth threshold greater than the fourth threshold; for example, the third regions are resized to "48*48" images and used as the input of the third convolutional neural network. Feature recognition is performed on the resized third regions through the third convolutional neural network; the feature types identified here differ from those identified by the first-level and second-level convolutional neural networks. After the recognition is completed, the face region can be determined among the third regions according to the identified features, for example by calculating the matching degree with a preset feature matching algorithm and taking the third region with the highest matching degree as the face region.
  • The features identified by the first-level convolutional neural network are relatively simple, and its discrimination threshold can be set loosely, so that a large number of non-face windows are eliminated while a high recall rate is maintained; the second-level and third-level convolutional neural networks can be designed to be more complex, but since they only need to process the remaining windows, sufficient efficiency can still be ensured.
  • Cascading helps combine weaker classifiers into a stronger one while providing a certain efficiency guarantee. Because the input image pixels at each level differ in size, the network can learn multi-scale feature combinations, which facilitates the final recognition of the face.
  • The depth models in the related art are relatively large (with many convolutional layers); models in the related art exceed 15 MB, so face detection is relatively slow (more than 300 ms on a mainstream PC) and cannot satisfy real-time requirements. The cascaded deep network architecture used in this application has a high detection rate, low false detection, fast speed (less than 40 ms on a mainstream PC), and a small model, which compensates for the shortcomings of the existing face detection methods.
  • Optionally, the processor is further configured to run a computer program saved in the memory to return position information of the face regions located by the convolutional neural network, where the position information indicates the positions of the face regions in the target picture.
  • For example, the position information may be {(x_i, y_i, w_i, h_i)}, i = 1, ..., N, where N is the number of faces, (x_i, y_i) represents the image coordinates of the upper-left vertex of the face frame, and w_i and h_i represent its width and height respectively.
  • The above position information is any information that can uniquely identify a face region in the image, and (x_i, y_i, w_i, h_i) is one representative form. The form can also be adjusted as needed, such as returning the coordinates of one of the other corners (e.g., the lower-left, upper-right, or lower-right corner) together with the width and height of the face frame; returning the coordinates of the center point of the region together with the width and height; or returning the coordinates of any two of the four corners (e.g., two diagonal corners).
  • After the position information is returned, applications such as face key point positioning, living body detection, and face recognition and retrieval can be completed. For example, features such as the eyes, nose, mouth, and eyebrows in the face region can be located according to the relevant algorithms.
  • It should be noted that the foregoing units have the same implementation examples and application scenarios as the corresponding steps, but are not limited to the contents disclosed in the foregoing embodiments. The foregoing units may run, as part of the device, in a hardware environment as shown in FIG. 2 (where the hardware environment includes a network environment), and may be implemented by software or by hardware.
  • According to an embodiment of the present application, a storage medium (also referred to as a memory) is also provided, the storage medium comprising a stored program, where the program is configured to perform any of the methods described above when run.
  • According to an embodiment of the present application, a server or terminal (also referred to as a computer device) for implementing the above method for determining a face region is also provided.
  • FIG. 12 is a structural block diagram of a terminal according to an embodiment of the present application.
  • As shown in FIG. 12, the terminal may include: one or more processors 1201 (only one is shown in FIG. 12), a memory 1203, and a transmission device 1205 (such as the computer device in the above embodiment); the terminal may further include an input/output device 1207.
  • The memory 1203 can be used to store software programs and modules, such as the program instructions/modules corresponding to the method for determining a face region in the embodiments of the present application; the processor 1201 performs the above method by running the software programs and modules stored in the memory 1203.
  • the memory 1203 may include a high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • memory 1203 can further include memory remotely located relative to processor 1201, which can be connected to the terminal over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the above-mentioned transmission device 1205 is used to receive or transmit data via a network, and can also be used for data transmission between the processor and the memory.
  • Specific examples of the above network may include a wired network and a wireless network.
  • the transmission device 1205 includes a Network Interface Controller (NIC) that can be connected to other network devices and routers via a network cable to communicate with the Internet or a local area network.
  • In one example, the transmission device 1205 is a radio frequency (RF) module for communicating with the Internet wirelessly.
  • the memory 1203 is used to store an application.
  • In this embodiment, the processor 1201 may invoke the application stored in the memory 1203 through the transmission device 1205 to perform the following steps: receiving a positioning request, where the positioning request is used to request locating a face region in a target picture; performing a face positioning operation on the target picture through the convolutional neural network to obtain a positioning result, where the convolutional neural network is used to call a graphics processor to perform a convolution operation on the target picture and the face positioning operation includes the convolution operation; and returning the positioning result when the positioning result indicates that a face region exists in the target picture.
  • The processor 1201 is further configured to: perform a convolution operation on the target picture by calling the graphics processor through the first convolutional neural network to obtain a convolution result, where the convolutional neural network includes the first convolutional neural network; determine, according to the convolution result, the confidence that the first regions in the target picture are face regions; and determine the face region among the first regions according to the confidence.
  • With the solution provided by the embodiments of the present application, when a positioning request is received, the face positioning operation is performed on the target picture through the convolutional neural network to obtain a positioning result, and the positioning result is returned when it indicates that a face region exists in the target picture. In the preliminary recognition of the face recognition process, the convolution operation is performed by directly calling the graphics processor through the full convolutional network in the convolutional neural network; this hardware acceleration approach, rather than the software approach of region-by-region scanning by the CPU, can solve the technical problem of poor real-time performance of face detection in the related art, thereby improving the real-time performance of face detection.
  • Optionally, the terminal can be a smart phone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), a PAD, or another terminal device. FIG. 12 does not limit the structure of the above computer device; the terminal may include more or fewer components than shown in FIG. 12 (such as a network interface or a display device), or have a configuration different from that shown in FIG. 12.
  • Embodiments of the present application also provide a storage medium.
  • the foregoing storage medium may be used to execute program code of a method for determining a face area.
  • the foregoing storage medium may be located on at least one of the plurality of network devices in the network shown in the foregoing embodiment.
  • Optionally, in this embodiment, the storage medium is arranged to store program code for performing the steps of the methods described above, for example:
  • the first convolutional neural network is used to invoke a graphics processor to perform a convolution operation on the target image to obtain a convolution result, and the convolutional neural network includes a first convolutional neural network;
  • optionally, in this embodiment, the foregoing storage medium may include, but is not limited to, a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a mobile hard disk, or a magnetic memory (for example, a hard disk).
  • the integrated unit in the above embodiments, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in the above computer-readable storage medium.
  • the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions for causing one or more computer devices (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the disclosed client may be implemented in other manners.
  • the embodiments of the computer device described above are merely illustrative.
  • the division of units is only a logical function division, and there may be other division manners in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
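The following is a minimal PyTorch sketch of the positioning flow described in this list: a fully convolutional network scores every location of the target picture in a single GPU forward pass, and the positioning result is returned only when a face region exists. `FaceFCN`, `handle_positioning_request`, the layer sizes, and the 0.5 threshold are illustrative assumptions, not the network or parameters of the present application.

```python
import torch
import torch.nn as nn


class FaceFCN(nn.Module):
    """Toy fully convolutional network emitting a per-location face-confidence map."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=1),  # 1x1 convolution: one score per location
        )

    def forward(self, x):
        return torch.sigmoid(self.features(x))  # confidences in [0, 1]


def handle_positioning_request(image: torch.Tensor, threshold: float = 0.5):
    """Serve a positioning request for a 3xHxW image tensor.

    Returns the confidence map when a face region exists, or None otherwise.
    """
    device = "cuda" if torch.cuda.is_available() else "cpu"  # call the graphics processor when present
    model = FaceFCN().to(device).eval()  # untrained stand-in for a trained detector
    with torch.no_grad():
        conf_map = model(image.unsqueeze(0).to(device))[0, 0]  # single convolutional pass
    if conf_map.max().item() >= threshold:  # positioning result: a face region exists
        return conf_map.cpu()
    return None  # no face region located in the target picture
```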
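Continuing the sketch, each cell of the confidence map can be mapped back to a first region of the target picture, and the face region determined within those regions according to the confidence. The stride of 4 follows from the two stride-2 convolutions in `FaceFCN`; the 12-pixel window is an assumed receptive-field size, not a value from the present application.

```python
import torch


def regions_from_confidence(conf_map: torch.Tensor, stride: int = 4,
                            window: int = 12, threshold: float = 0.5):
    """Turn a confidence map into (x, y, width, height, confidence) candidates.

    Each map cell corresponds to one first region in the original picture;
    regions whose confidence clears the threshold are kept, best first.
    """
    regions = []
    rows, cols = conf_map.shape
    for r in range(rows):
        for c in range(cols):
            score = conf_map[r, c].item()
            if score >= threshold:
                regions.append((c * stride, r * stride, window, window, score))
    return sorted(regions, key=lambda reg: reg[-1], reverse=True)
```

A caller would chain the two sketches: `conf_map = handle_positioning_request(image)`, then `regions_from_confidence(conf_map)` when the map is not None.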
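For contrast, the software-only approach that the embodiment avoids classifies each region of the picture separately on the CPU. The sketch below makes the cost visible: one classifier call per candidate window, versus the single convolutional pass above. `classify_window` is a hypothetical per-window classifier, not part of the present application.

```python
import torch


def cpu_style_scan(image: torch.Tensor, classify_window,
                   stride: int = 4, window: int = 12):
    """Region-by-region software scan: the slow path that hardware acceleration replaces."""
    _, h, w = image.shape
    scores = []
    for y in range(0, h - window + 1, stride):
        for x in range(0, w - window + 1, stride):
            crop = image[:, y:y + window, x:x + window]
            scores.append(classify_window(crop))  # one call per candidate region
    return scores
```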

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a method for determining a human face region, a storage medium, and a computer device. The method comprises: receiving a positioning request, the positioning request being used to request locating a human face region in a target picture; performing a face positioning operation on the target picture by means of a convolutional neural network to obtain a positioning result, the convolutional neural network being used to call a graphics processor to perform a convolution operation on the target picture, and the face positioning operation comprising a convolution operation; and, where the positioning result is used to indicate that a human face region exists in the target picture, returning the positioning result. By means of the present invention, the technical problem of poor real-time performance of human face detection in the related art is solved.
PCT/CN2018/079551 2017-04-11 2018-03-20 Procédé de détermination d'une zone de visage humain, support de stockage et dispositif informatique WO2018188453A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710233590.6A CN107145833A (zh) 2017-04-11 2017-04-11 人脸区域的确定方法和装置
CN201710233590.6 2017-04-11

Publications (1)

Publication Number Publication Date
WO2018188453A1 true WO2018188453A1 (fr) 2018-10-18

Family

ID=59773604

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/079551 WO2018188453A1 (fr) 2017-04-11 2018-03-20 Procédé de détermination d'une zone de visage humain, support de stockage et dispositif informatique

Country Status (2)

Country Link
CN (1) CN107145833A (fr)
WO (1) WO2018188453A1 (fr)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107145833A (zh) * 2017-04-11 2017-09-08 腾讯科技(上海)有限公司 人脸区域的确定方法和装置
CN107644209A (zh) 2017-09-21 2018-01-30 百度在线网络技术(北京)有限公司 人脸检测方法和装置
CN107644208A (zh) * 2017-09-21 2018-01-30 百度在线网络技术(北京)有限公司 人脸检测方法和装置
CN107704836B (zh) * 2017-10-17 2021-10-08 电子科技大学 基于物体检测的疲劳驾驶检测方法
CN108010078B (zh) * 2017-11-29 2020-06-26 中国科学技术大学 一种基于三级卷积神经网络的物体抓取检测方法
CN107992844B (zh) * 2017-12-14 2022-01-18 离娄科技(北京)有限公司 基于深度学习的人脸识别系统及方法
US10574890B2 (en) * 2018-01-12 2020-02-25 Movidius Ltd. Methods and apparatus to operate a mobile camera for low-power usage
CN108363962B (zh) * 2018-01-25 2021-01-26 南京邮电大学 一种基于多层次特征深度学习的人脸检测方法及系统
CN110147703B (zh) * 2018-08-20 2023-10-31 腾讯科技(深圳)有限公司 人脸关键点检测方法、装置及存储介质
CN109146906B (zh) * 2018-08-22 2021-03-23 Oppo广东移动通信有限公司 图像处理方法和装置、电子设备、计算机可读存储介质
US10915995B2 (en) 2018-09-24 2021-02-09 Movidius Ltd. Methods and apparatus to generate masked images based on selective privacy and/or location tracking
CN109784207B (zh) * 2018-12-26 2020-11-24 深圳云天励飞技术有限公司 一种人脸识别方法、装置及介质
CN109993086B (zh) * 2019-03-21 2021-07-27 北京华捷艾米科技有限公司 人脸检测方法、装置、系统及终端设备
CN110096964B (zh) * 2019-04-08 2021-05-04 厦门美图之家科技有限公司 一种生成图像识别模型的方法
CN110135279B (zh) * 2019-04-23 2021-06-08 深圳神目信息技术有限公司 一种基于人脸识别的预警方法、装置、设备及计算机可读介质
CN110046602A (zh) * 2019-04-24 2019-07-23 李守斌 基于分类的深度学习人脸检测方法
CN110236530A (zh) * 2019-06-20 2019-09-17 武汉中旗生物医疗电子有限公司 一种心电信号qrs波群定位方法、装置及计算机存储介质
CN111079625B (zh) * 2019-12-11 2023-10-27 江苏国光信息产业股份有限公司 一种摄像头自动跟随人脸转动的控制方法
CN111583671B (zh) * 2020-06-05 2022-05-31 南京信息职业技术学院 一种毫米波雷达路口车流量监测方法及其系统
CN112257491B (zh) * 2020-08-20 2021-12-24 江苏正赫通信息科技有限公司 自适应调度人脸识别和属性分析方法及装置
CN112967292B (zh) * 2021-03-01 2022-03-04 焦点科技股份有限公司 一种针对电商产品的自动抠图与评分方法及系统
CN113222973B (zh) * 2021-05-31 2024-03-08 深圳市商汤科技有限公司 图像处理方法及装置、处理器、电子设备及存储介质
CN117503062B (zh) * 2023-11-21 2024-04-09 欣颜时代(广州)技术有限公司 美容仪的神经检测控制方法、装置、设备和存储介质

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160148079A1 (en) * 2014-11-21 2016-05-26 Adobe Systems Incorporated Object detection using cascaded convolutional neural networks
US20170083752A1 (en) * 2015-09-18 2017-03-23 Yahoo! Inc. Face detection
CN105912990A (zh) * 2016-04-05 2016-08-31 深圳先进技术研究院 人脸检测的方法及装置
CN105975961A (zh) * 2016-06-28 2016-09-28 北京小米移动软件有限公司 人脸识别的方法、装置及终端
CN106295502A (zh) * 2016-07-25 2017-01-04 厦门中控生物识别信息技术有限公司 一种人脸检测方法及装置
CN106295678A (zh) * 2016-07-27 2017-01-04 北京旷视科技有限公司 神经网络训练与构建方法和装置以及目标检测方法和装置
CN106355573A (zh) * 2016-08-24 2017-01-25 北京小米移动软件有限公司 图片中目标物的定位方法及装置
CN106446862A (zh) * 2016-10-11 2017-02-22 厦门美图之家科技有限公司 一种人脸检测方法及系统
CN107145833A (zh) * 2017-04-11 2017-09-08 腾讯科技(上海)有限公司 人脸区域的确定方法和装置
CN107688786A (zh) * 2017-08-30 2018-02-13 南京理工大学 一种基于级联卷积神经网络的人脸检测方法

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109345770A (zh) * 2018-11-14 2019-02-15 深圳市尼欧科技有限公司 一种孩童遗留车内报警系统及孩童遗留车内报警方法
CN111753598A (zh) * 2019-03-29 2020-10-09 中国联合网络通信集团有限公司 人脸检测方法及设备
CN110472728A (zh) * 2019-07-30 2019-11-19 腾讯科技(深圳)有限公司 目标信息确定方法、目标信息确定装置、介质及电子设备
CN110472728B (zh) * 2019-07-30 2023-05-23 腾讯科技(深圳)有限公司 目标信息确定方法、目标信息确定装置、介质及电子设备
CN110969189A (zh) * 2019-11-06 2020-04-07 杭州宇泛智能科技有限公司 人脸检测方法、装置及电子设备
CN110969189B (zh) * 2019-11-06 2023-07-25 杭州宇泛智能科技有限公司 人脸检测方法、装置及电子设备
CN110879983B (zh) * 2019-11-18 2023-07-25 讯飞幻境(北京)科技有限公司 一种人脸特征关键点的提取方法和一种人脸图像合成方法
CN110879983A (zh) * 2019-11-18 2020-03-13 讯飞幻境(北京)科技有限公司 一种人脸特征关键点的提取方法和一种人脸图像合成方法
CN111274886A (zh) * 2020-01-13 2020-06-12 天地伟业技术有限公司 一种基于深度学习的行人闯红灯违法行为分析方法及系统
CN111274886B (zh) * 2020-01-13 2023-09-19 天地伟业技术有限公司 一种基于深度学习的行人闯红灯违法行为分析方法及系统
CN113128320A (zh) * 2020-01-16 2021-07-16 浙江舜宇智能光学技术有限公司 基于tof相机的人脸活体检测方法、检测装置及电子设备
CN113128320B (zh) * 2020-01-16 2023-05-16 浙江舜宇智能光学技术有限公司 基于tof相机的人脸活体检测方法、检测装置及电子设备
CN111401247A (zh) * 2020-03-17 2020-07-10 杭州趣维科技有限公司 一种基于级联卷积神经网络的人像分割方法
CN111401247B (zh) * 2020-03-17 2023-07-28 杭州小影创新科技股份有限公司 一种基于级联卷积神经网络的人像分割方法
CN111428628A (zh) * 2020-03-23 2020-07-17 北京每日优鲜电子商务有限公司 人脸检测方法、装置、设备及存储介质
CN111770299B (zh) * 2020-04-20 2022-04-19 厦门亿联网络技术股份有限公司 一种智能视频会议终端的实时人脸摘要服务的方法及系统
CN111770299A (zh) * 2020-04-20 2020-10-13 厦门亿联网络技术股份有限公司 一种智能视频会议终端的实时人脸摘要服务的方法及系统
CN111553216A (zh) * 2020-04-20 2020-08-18 维沃移动通信有限公司 图像处理方法、电子设备及存储介质
CN111814568A (zh) * 2020-06-11 2020-10-23 开易(北京)科技有限公司 一种用于驾驶员状态监测的目标检测方法及装置
CN111814568B (zh) * 2020-06-11 2022-08-02 开易(北京)科技有限公司 一种用于驾驶员状态监测的目标检测方法及装置
CN111783601B (zh) * 2020-06-24 2024-04-26 北京百度网讯科技有限公司 人脸识别模型的训练方法、装置、电子设备及存储介质
CN111783601A (zh) * 2020-06-24 2020-10-16 北京百度网讯科技有限公司 人脸识别模型的训练方法、装置、电子设备及存储介质
CN111767832A (zh) * 2020-06-28 2020-10-13 北京百度网讯科技有限公司 模型生成方法、装置、电子设备及存储介质
CN111931661A (zh) * 2020-08-12 2020-11-13 桂林电子科技大学 一种基于卷积神经网络的实时口罩佩戴检测方法
CN114764925A (zh) * 2020-12-30 2022-07-19 北京眼神智能科技有限公司 口罩佩戴检测方法、装置、计算机可读存储介质及设备
CN113095284A (zh) * 2021-04-30 2021-07-09 平安国际智慧城市科技股份有限公司 人脸选取方法、装置、设备及计算机可读存储介质
CN113361413A (zh) * 2021-06-08 2021-09-07 南京三百云信息科技有限公司 一种里程显示区域检测方法、装置、设备及存储介质
CN113343927A (zh) * 2021-07-03 2021-09-03 郑州铁路职业技术学院 一种适用于面瘫患者的智能化人脸识别方法和系统
CN116012649A (zh) * 2022-12-30 2023-04-25 东莞理工学院 用于医学图像的集成学习投票分类方法、系统及终端
CN116012649B (zh) * 2022-12-30 2023-09-19 东莞理工学院 用于医学图像的集成学习投票分类方法、系统及终端
CN118254663A (zh) * 2024-05-31 2024-06-28 天和骏行智能装备(福建)有限公司 一种电力工程抢险车

Also Published As

Publication number Publication date
CN107145833A (zh) 2017-09-08

Similar Documents

Publication Publication Date Title
WO2018188453A1 (fr) Procédé de détermination d'une zone de visage humain, support de stockage et dispositif informatique
WO2021077984A1 (fr) Procédé et appareil de reconnaissance d'objets, dispositif électronique et support de stockage lisible
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
WO2019218824A1 (fr) Procédé d'acquisition de piste de mouvement et dispositif associé, support de stockage et terminal
CN107766786B (zh) 活性测试方法和活性测试计算设备
KR102174595B1 (ko) 비제약형 매체에 있어서 얼굴을 식별하는 시스템 및 방법
WO2022078041A1 (fr) Procédé d'entraînement de modèle de détection d'occlusion et procédé d'embellissement d'image faciale
WO2018133666A1 (fr) Procédé et appareil de suivi de cible vidéo
CN111767900B (zh) 人脸活体检测方法、装置、计算机设备及存储介质
CN110135249B (zh) 基于时间注意力机制和lstm的人体行为识别方法
WO2018133791A1 (fr) Procédé et système de discrimination de corps vivant basés sur une analyse vidéo, et support d'informations
US20140003663A1 (en) Method of detecting facial attributes
WO2022252642A1 (fr) Procédé et appareil de détection de posture de comportement basés sur une image vidéo, et dispositif et support
CN111814620A (zh) 人脸图像质量评价模型建立方法、优选方法、介质及装置
CN112446322B (zh) 眼球特征检测方法、装置、设备及计算机可读存储介质
CN112836625A (zh) 人脸活体检测方法、装置、电子设备
CN109389002A (zh) 活体检测方法及装置
WO2022267653A1 (fr) Procédé de traitement d'images, dispositif électronique et support d'enregistrement lisible par ordinateur
CN112149615A (zh) 人脸活体检测方法、装置、介质及电子设备
CN110728242A (zh) 基于人像识别的图像匹配方法、装置、存储介质及应用
WO2023279799A1 (fr) Procédé et appareil d'identification d'objet et système électronique
CN112434647A (zh) 一种人脸活体检测方法
CN112766065A (zh) 一种移动端考生身份认证方法、装置、终端及存储介质
US11605220B2 (en) Systems and methods for video surveillance
CN112700568B (zh) 一种身份认证的方法、设备及计算机可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18784448

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18784448

Country of ref document: EP

Kind code of ref document: A1