WO2019033715A1 - Face image data collection method, apparatus, terminal device, and storage medium - Google Patents



Publication number
WO2019033715A1
WO2019033715A1 · PCT/CN2018/074575 · CN2018074575W
Authority
WO
WIPO (PCT)
Prior art keywords
image
face
original image
preset
facial
Application number
PCT/CN2018/074575
Other languages
English (en)
French (fr)
Inventor
朱志博
陈伟杰
吴善鹏
Original Assignee
Ping An Technology (Shenzhen) Co., Ltd.
Application filed by Ping An Technology (Shenzhen) Co., Ltd.
Priority to SG11201809210VA priority Critical patent/SG11201809210VA/en
Priority to US16/088,828 priority patent/US20200387748A1/en
Publication of WO2019033715A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/169Holistic features and representations, i.e. based on the facial image taken as a whole
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to a method, an apparatus, a terminal device, and a storage medium for collecting face image data.
  • Face recognition technology is a biometric recognition technology based on human facial feature information for identification.
  • the face recognition technology specifically uses a camera or video camera to collect an image or video stream containing a face; a face recognition model automatically detects the face in the image or video stream, and then performs face recognition on the detected face.
  • With the development and popularization of face recognition technology, a large amount of face image data is needed to train the face recognition model and improve its accuracy in recognizing faces.
  • the current face image data collection process requires a lot of manpower and material resources, and the cost is high and the efficiency is low.
  • the present application provides a method, a device, a terminal device and a storage medium for collecting face image data, so as to solve the problem that the current face image data collection process is inefficient.
  • the present application provides a method for collecting facial image data, including:
  • the target face image is intercepted from the effective image by using a preset selection frame.
  • the present application provides a face image data collection device, including:
  • the original image crawling module is used to crawl the original image from the network using the image crawler tool.
  • An effective image recognition module is configured to identify the original image by using a face recognition algorithm to obtain an effective image including a face feature.
  • the effective image intercepting module is configured to intercept the target facial image from the effective image by using a preset selection frame.
  • the present application provides a terminal device including a memory, a processor, and computer readable instructions stored in the memory and executable on the processor; when the processor executes the computer readable instructions, the following steps are implemented:
  • the target face image is intercepted from the effective image by using a preset selection frame.
  • the present application provides a computer readable storage medium storing computer readable instructions that, when executed by a processor, implement the following steps:
  • the target face image is intercepted from the effective image by using a preset selection frame.
  • the present application has the following advantages: in the method, device, terminal device, and storage medium for collecting face image data provided by the present application, an image crawler tool is used to crawl original images from the network, which automatically captures a large number of original images according to preset rules, so data collection is fast. A face recognition algorithm then identifies the original images to obtain effective images containing facial features, which prevents original images without facial features from being used as effective images and ensures that the collected effective images can be applied to face recognition model training, improving the effectiveness and accuracy of that training. Finally, a preset selection frame intercepts the target face image from the effective image, so that applying the acquired target face images to face recognition model training can effectively improve the accuracy of the face recognition model.
  • FIG. 1 is a flow chart of the method for collecting face image data in Embodiment 1.
  • FIG. 2 is a specific flow chart of step S10 of FIG. 1.
  • FIG. 3 is a specific flow chart of step S20 of FIG. 1.
  • FIG. 4 is another specific flow chart of step S20 of FIG. 1.
  • FIG. 5 is a specific flowchart of step S30 in FIG. 1.
  • Fig. 6 is a schematic block diagram of a face image data collecting device in the second embodiment.
  • FIG. 7 is a schematic block diagram of a terminal device in Embodiment 4.
  • FIG. 1 shows a method of collecting face image data in the present embodiment.
  • the face image data collecting method can quickly collect a large amount of face image data from the network, so as to perform face recognition model training based on the collected face image data.
  • the face image data collecting method includes the following steps:
  • the image crawler tool is a program that can automatically crawl the webpage address of the webpage containing the image, and download the image based on the crawled webpage address.
  • the image crawler tool only crawls the pictures in the network without crawling other data, and is highly targeted, which is beneficial to improve image collection efficiency.
  • the original image is an image downloaded from the network using the image crawler tool.
  • the image crawler tool can be used to download a large number of original images from social networking websites, search engines, or other websites; the data volume is large and the acquisition process is simple and convenient.
  • the image crawler tool includes a web crawler and an image downloading tool, and the web crawler and the image downloading tool can be integrated into one whole or implemented separately.
  • the web crawler is a program or script that automatically grabs Internet information according to certain rules.
  • the image download tool is a program or script that automatically downloads images from the Internet based on the entered web page address.
  • the image crawler tool can adopt a distributed image crawler tool, such as a python image crawler tool, which can realize the parallel capture of the original image and improve the crawling efficiency of the original image.
  • the python image crawler tool integrates a web crawler and a picture download tool.
  • step S10 specifically includes the following steps:
  • S11: Use a web crawler to crawl, from the network, the webpage addresses of webpages containing original images.
  • the webpage address (URL, Uniform Resource Locator) is the address of a standard resource on the Internet; here it is the address of the webpage where the original image is located.
  • the web crawler automatically crawls the webpage address including the original image from the Internet according to the crawler task set by the user, and does not need manual search, which is beneficial to improving data collection efficiency.
  • webpage address that uses the web crawler to crawl the original image from the network specifically includes the following steps:
  • the original webpage address is a user-defined webpage address that starts the crawling task.
  • the paging rule is a user-defined rule for paging a webpage, and can be set according to the actual source of the data, and the setting process may adopt a fixed format or an unfixed format.
  • Keywords are words that a web crawler searches during crawling a network address.
  • the keyword may be a word obtained by the user after clustering historical data, so that the probability of obtaining effective images from searches based on the keyword is high; for example, if the keyword is "selfie", the probability that the acquired images are effective images containing facial features is high.
  • the web crawler is enabled to perform the crawler task, starting from the original webpage address and crawling the webpage address containing the original image based on the paging rules and keywords.
  • a preset search strategy may be used to continuously crawl new webpage addresses from the current page into the message queue to be downloaded, stopping the crawler task once the preset stop condition is satisfied.
  • the preset search strategy includes, but is not limited to, the breadth-first search strategy employed in this embodiment or a depth-first search strategy.
  • the webpage addresses of the original images crawled in step S11 are stored in the message queue to be downloaded in the chronological order of crawling, so that when step S13 is performed, images can be downloaded based on the webpage addresses in the message queue to be downloaded.
  • the message queue to be downloaded processes webpage addresses in first-in-first-out (FIFO) order, so that crawling webpage addresses and downloading the original images at those addresses are processed asynchronously, which helps improve the efficiency of acquiring original images.
  • the image downloading tool is used to crawl the original image from the webpage corresponding to the webpage address in the message queue to be downloaded.
  • the image downloading tool is a tool for downloading images in batches, and automatically downloads all the images in the page corresponding to the webpage address according to the input webpage address.
  • the image download tool can be integrated into the image crawler tool, such as the python image web crawler integrated with the image download tool; or it can be a separate image download tool, such as the NeoDownloader tool, which can quickly download the image in batches.
  • a plurality of webpage addresses including the original image are stored in the message queue to be downloaded, and the image downloading tool sequentially obtains the webpage address from the to-be-downloaded message queue and downloads the original image corresponding to the webpage address.
  • the image downloading tool obtains a webpage address from the head of the message queue to be downloaded, downloads the images at that address, stores the downloaded original images in the database, and removes the corresponding webpage address from the queue. These steps are repeated until no webpage address remains in the message queue to be downloaded, so that the original images corresponding to all the webpage addresses crawled by the web crawler are obtained.
  • the webpage addresses of original images crawled by the web crawler are stored in the message queue to be downloaded, and the image downloading tool then downloads original images based on the webpage addresses obtained from the queue, so that webpage-address crawling and original-image downloading are processed asynchronously, which helps improve the acquisition efficiency of the original images.
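The producer/consumer split described above — the web crawler pushes webpage addresses onto a FIFO queue while the image downloading tool drains it — can be sketched as follows. `discover` and `download` are hypothetical stand-ins for the real crawler and image-download tool, and `max_pages` is an illustrative stop condition.

```python
from collections import deque

def crawl_and_download(start_url, discover, download, max_pages=100):
    """Breadth-first crawl: push newly found webpage addresses onto a FIFO
    queue, then pop addresses in order and download the images on each page."""
    to_download = deque([start_url])   # message queue to be downloaded
    seen = {start_url}
    images, processed = [], 0
    while to_download and processed < max_pages:
        url = to_download.popleft()    # FIFO: oldest crawled address first
        processed += 1
        for nxt in discover(url):      # addresses found on the current page
            if nxt not in seen:
                seen.add(nxt)
                to_download.append(nxt)
        images.extend(download(url))   # download all images on this page
    return images
```

A real implementation would run the crawl and download sides concurrently; the single loop here only illustrates the FIFO ordering.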
  • S20 Identifying the original image by using a face recognition algorithm to obtain an effective image including the face feature.
  • the face recognition algorithm is an algorithm for identifying face features in an image.
  • a face recognition program is preset, and the face recognition program stores a face recognition algorithm.
  • the face recognition algorithm may be used to perform face recognition on the original image to obtain an effective image that contains facial features.
  • the original images downloaded from the network by the image crawler tool are cached in the database, the storage addresses of the original images in the database are placed in the message queue to be identified, and the face recognition program sequentially reads addresses from the message queue to be identified and recognizes the corresponding original images.
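The identification loop over the message queue to be identified can be sketched as follows; `recognize` and `delete` are hypothetical stand-ins for the stored face recognition algorithm and the database cleanup, and images in which no facial feature is found are deleted rather than kept as effective images.

```python
from collections import deque

def identify_effective_images(to_identify, recognize, delete):
    """Drain the message queue to be identified: keep originals in which the
    face recognition algorithm finds a facial feature, delete the rest."""
    effective = []
    queue = deque(to_identify)
    while queue:
        image = queue.popleft()
        if recognize(image):       # facial feature found
            effective.append(image)
        else:
            delete(image)          # free database storage space
    return effective
```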
  • the face recognition algorithm may be a geometric-feature-based face recognition algorithm, an eigenface-based (Eigenface) face recognition algorithm, an elastic-model-based face recognition algorithm, a neural-network-based (Neural Networks) face recognition algorithm, etc.
  • the face recognition algorithm based on geometric features is a method for face recognition by extracting geometric features of organs such as eyes, ears, mouth, nose and eyebrows as classification features.
  • the eigenface-based face recognition algorithm constructs a principal subspace from a set of face training images. The original image is projected onto the principal subspace to obtain a set of projection coefficients, which are compared with those of each known face image to identify facial features; since the principal components have face-like shapes, they are called eigenfaces (Eigenface).
  • the elastic-model-based face recognition algorithm describes an object with a sparse graph: its vertices represent multi-scale descriptions of local energy, its edges represent the topological connection relationships and are labelled with geometric distances, and elastic graph matching is then applied to find the closest known graph.
  • the neural-network-based face recognition algorithm treats recognition as a nonlinear dynamic system: multiple principal components are extracted, an autocorrelation neural network maps them into a multi-dimensional space, and a multi-layer perceptron performs the judgment to identify faces; the method has good self-organization and adaptive ability.
  • step S20 specifically includes the following steps:
  • S211 Identifying the original image by using a face recognition algorithm to determine whether there is a facial feature in the original image.
  • the facial features are a kind of facial features, including the facial features of the five organs of the eyes, ears, mouth, nose and eyebrows.
  • the eigenface-based face recognition algorithm may be used to identify the original image and determine whether facial features exist in it, which specifically includes the following steps: first, an Active Appearance Model (AAM) is used to detect the facial features and their feature vectors in the original image.
  • Then, Principal Component Analysis (PCA) is applied to the detected feature vectors for dimensionality reduction.
  • the K-means method is used to classify the feature vectors processed by PCA, which can realize simple and fast classification.
  • finally, a support vector machine (SVM) is used to train the K categories of data into a classification model, so that whether the original image contains facial features can be identified based on the classification model.
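The PCA → K-means → SVM pipeline just described can be sketched with scikit-learn as a stand-in implementation; the feature vectors, component count, and cluster count K below are illustrative, and the AAM detection step is assumed to have already produced the vectors.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def train_feature_classifier(feature_vectors, n_components=2, k=2):
    """PCA-reduce the detected feature vectors, group them into K categories
    with K-means, then fit an SVM classification model on the labelled data."""
    pca = PCA(n_components=n_components).fit(feature_vectors)
    reduced = pca.transform(feature_vectors)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(reduced)
    svm = SVC(kernel="linear").fit(reduced, labels)
    return pca, svm
```

At inference time, a new image's feature vector is passed through `pca.transform` and then `svm.predict` to decide which category it belongs to.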
  • the original image is identified using the face recognition algorithm: if facial features are identified in the original image, step S212 is performed; if no facial features are found, the original image is not an effective image containing facial features, and it is deleted to save storage space in the database.
  • the facial-features integrity is the ratio of the facial features identified in the original image to the complete facial features.
  • facial-features integrity = Σ (organ weight × organ integrity), where the organs include the eyes, ears, mouth, nose, and eyebrows.
  • organ integrity refers to the integrity of the five organs of the eye, ear, mouth, nose and eyebrow.
  • the organ integrity is the ratio of the organ characteristics identified in the original image to the intact organ characteristics.
  • the organ weight is a user-defined weight constant, and the organ weight can be set according to the distance of the organ from the center of the face.
  • the distance of the nose from the center of the face is the closest, and the weight of the nose is the largest; accordingly, the ear is the farthest from the center of the face, and the weight of the ear is the smallest.
  • if an organ appears completely in the original image, its organ integrity is 100%; if only half of it appears in the original image, its organ integrity is 50%.
  • the preset integrity is a reference value pre-set by the user for evaluating the integrity of the facial features.
  • the preset integrity is user-defined and can be set to 80% or other values.
  • the facial integrity of the original image reaches the preset integrity, the facial features in the original image are considered to be complete, that is, the original image contains a relatively complete facial feature, which can be saved as an effective image.
  • if the facial integrity does not reach the preset integrity, the facial features in the original image are considered incomplete; the original image cannot be used as training data for the face recognition model and is deleted to save database storage space.
  • for example, if the original image contains a complete face, the facial integrity is 100%; with a preset integrity of 80%, the original image is saved as an effective image. If only half of a face appears in the original image, the facial integrity is 50%, which does not reach the preset integrity of 80%; the original image is not an effective image and is deleted to save database storage space.
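The weighted-integrity computation above can be sketched as follows; the per-organ weights are hypothetical values that merely follow the stated rule (nose heaviest, ears lightest) and sum to 1.

```python
# Hypothetical weights ordered by distance from the face centre, summing to 1.
ORGAN_WEIGHTS = {"nose": 0.30, "eyes": 0.25, "mouth": 0.20,
                 "eyebrows": 0.15, "ears": 0.10}

def facial_integrity(organ_integrity):
    """Facial integrity = sum of organ weight * organ integrity (each in [0, 1])."""
    return sum(w * organ_integrity.get(o, 0.0) for o, w in ORGAN_WEIGHTS.items())

def is_effective(organ_integrity, preset_integrity=0.8):
    # preset_integrity is the user-defined reference value (80% in the example)
    return facial_integrity(organ_integrity) >= preset_integrity
```

With every organ fully visible the integrity is 1.0 and the image is kept; with only half of each organ visible it is 0.5, below the 0.8 preset, and the image is discarded.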
  • step S20 specifically includes the following steps:
  • S221 Identifying an original image by using a face recognition algorithm to determine whether a face region exists in the original image.
  • the face area is a facial feature above the neck of the person.
  • the face area includes not only five organs such as the eyes, ears, mouth, nose, and eyebrows, but also features such as the skin color of the face and the expression of the face.
  • a geometric-feature-based face recognition algorithm, an eigenface-based face recognition algorithm, an elastic-model-based face recognition algorithm, or a neural-network-based face recognition algorithm may also be adopted to recognize the face region.
  • a face recognition algorithm based on the BP (Back Propagation) neural network may be adopted in this embodiment.
  • the BP neural network is a forward network, which generally includes an input layer, a hidden layer, and an output layer.
  • the hidden layer can be one layer, two layers, or more; to analyze the interaction between various factors, each layer is composed of several neurons, and every pair of neurons in adjacent layers is connected by a weight.
  • the magnitude of the weight reflects the strength of the connection between the two neurons.
  • the calculation process of the entire network is one-way from the input layer to the hidden layer to the output layer.
  • the BP neural network essentially learns an input-to-output mapping from a large number of input-output pairs.
  • the process of recognizing the original image by the BP neural network face recognition algorithm specifically includes the following steps: (1) Performing image compression, image sampling, and input vector normalization on the original image to obtain image features.
  • image compression uses an interpolation algorithm such as nearest-neighbour, bilinear, or bicubic interpolation to compress the original image, avoiding the structural complexity a BP neural network would need to handle the large amount of redundant information in the original image.
  • image sampling converts the compressed two-dimensional image matrix, column by column, into a one-dimensional column vector to serve as the input of the subsequent BP neural network.
  • Input vector normalization is to normalize the one-dimensional column vector obtained by image sampling to avoid large one-dimensional column vector values, which affects computational efficiency and convergence rate.
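The three preprocessing steps (compression, column-wise sampling, normalization) can be sketched with NumPy; the output size and the 8-bit pixel range are assumptions for illustration.

```python
import numpy as np

def preprocess_for_bp(image, out_h=16, out_w=16):
    """(1) compress via nearest-neighbour sampling, (2) flatten the 2-D matrix
    column by column into a 1-D vector, (3) normalise 8-bit values into [0, 1]."""
    h, w = image.shape
    rows = np.arange(out_h) * h // out_h      # nearest-neighbour row indices
    cols = np.arange(out_w) * w // out_w      # nearest-neighbour column indices
    compressed = image[rows][:, cols]         # image compression
    vector = compressed.flatten(order="F")    # column-wise image sampling
    return vector.astype(np.float64) / 255.0  # input vector normalisation
```

The resulting vector has `out_h * out_w` entries in [0, 1] and can be fed to the input layer of the BP network.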
  • the preset probability is a user-defined reference probability for evaluating whether a face region exists in the original image.
  • the original image is identified using the face recognition algorithm: if a face region is recognized in the original image, step S222 is performed; if no face region is found, the original image is not an effective image containing facial features, and it is deleted to save storage space in the database.
  • the face image ratio refers to the ratio of the image size corresponding to the face region to the image size of the original image.
  • the face area may be defined by a rectangular frame, and the image size corresponding to the face area is the area of the rectangular frame. Accordingly, the image size of the original image is the area of the original image. That is, the face image ratio is the ratio of the area of the face area to the area of the original image.
  • the preset ratio is a pre-set value for evaluating the original image as a valid image, and the preset ratio is a reference value, which can be customized by the user.
  • if the face image ratio of the original image is greater than the preset ratio, the original image is determined to be an effective image containing facial features. If the face image ratio is not greater than the preset ratio, the area of the face region in the original image is too small; using such an image as training data for the subsequent face recognition model may affect the model's accuracy and training efficiency. Therefore, an original image with a small face image ratio is not taken as an effective image and is deleted to save database storage space.
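The ratio test just described can be sketched as follows; the (x, y, width, height) representation of the rectangular face frame and the 0.2 threshold are illustrative assumptions.

```python
def face_image_ratio(face_box, image_w, image_h):
    """Ratio of the face rectangle's area to the area of the original image."""
    x, y, w, h = face_box
    return (w * h) / (image_w * image_h)

def is_effective_by_ratio(face_box, image_w, image_h, preset_ratio=0.2):
    # preset_ratio is the user-defined reference value
    return face_image_ratio(face_box, image_w, image_h) > preset_ratio
```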
  • steps S211-S213 may be adopted, i.e., determining whether the original image is an effective image containing facial features by comparing the facial-features integrity with the preset integrity; alternatively, steps S221-S223 may be adopted, i.e., comparing the face image ratio with the preset ratio. Either judgment method improves, to a certain extent, the suitability of the acquired effective images for subsequent face recognition model training.
  • steps S211-S213 and steps S221-S223 can also be combined, i.e., an original image is determined to be an effective image only when the facial-features integrity reaches the preset integrity and the face image ratio exceeds the preset ratio, so as to further improve the accuracy of subsequent face recognition model training based on the effective images.
  • S30 The target face image is intercepted from the effective image by using a preset selection frame.
  • the preset selection frame is a user-defined selection frame for capturing an image from the effective image, and may be a rectangular frame. The effective image obtained in step S20 contains both the facial-feature area and non-face areas; the face recognition model generally only attends to the facial features in the effective image and not to the non-face areas. If effective images containing non-face areas are used directly for face recognition model training, training accuracy may be affected. Therefore, the preset selection frame is used to intercept the part containing the facial features from the effective image, obtaining and saving the target face image, so as to improve the accuracy of subsequent face recognition model training based on the target face images.
  • step S30 specifically includes the following steps:
  • S31 The initial face image including the face feature is intercepted from the effective image by using a preset selection frame.
  • the initial face image is an image obtained by intercepting the effective image.
  • the preset selection frame is used to select the part of the face feature in the effective image, and a screenshot operation is performed to intercept the initial face image including the face feature.
  • the position of the facial features to be intercepted is determined, and the screenshot operation is then performed to obtain the initial face image.
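The interception in step S31 amounts to an array crop; the (x, y, width, height) representation of the preset selection frame below is an assumption for illustration.

```python
import numpy as np

def intercept_initial_face(effective_image, selection_frame):
    """Crop the region covered by the preset selection frame
    (x, y, width, height) out of the effective image."""
    x, y, w, h = selection_frame
    return effective_image[y:y + h, x:x + w]
```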
  • if the initial face image acquired in step S31 is directly used as training data for the face recognition model and its pixel value is low, the training accuracy and efficiency of the face recognition model may be affected.
  • the actual pixel value of the initial face image needs to be determined during the image acquisition process to determine whether the initial face image can be used as training data for the face recognition model training.
  • the RGB values of the initial face image may be processed using Matlab or OpenCV to obtain the actual pixel value of the initial face image.
  • the preset pixel value is a pixel value required as a face recognition model training image, and the preset pixel value is a pixel reference value customized by the user according to requirements.
  • the smaller the preset pixel value, the more images satisfy the condition, but the lower the accuracy and efficiency of face recognition model training; therefore, the preset pixel value needs to be set moderately.
  • if the actual pixel value of the initial face image is greater than the preset pixel value, it is determined that the initial face image reaches the pixel value required for face recognition model training, and the initial face image is output as the target face image.
  • if the actual pixel value of the initial face image is not greater than the preset pixel value, the actual pixel value is determined to be too low; training the face recognition model with such an image may affect the accuracy and effect of model training, so the original image corresponding to the initial face image is deleted to save database storage space.
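A minimal sketch of the pixel check in the steps above, reading "actual pixel value" as the total pixel count of the intercepted image (an assumption; the source also allows computing it from the RGB values with Matlab or OpenCV); the 128×128 threshold is illustrative.

```python
import numpy as np

def actual_pixel_value(initial_face):
    """Total pixel count (height * width) of the initial face image."""
    h, w = initial_face.shape[:2]
    return h * w

def select_target_face(initial_face, preset_pixel_value=128 * 128):
    """Output the image as the target face image only if its actual
    pixel value exceeds the preset pixel value; otherwise discard it."""
    if actual_pixel_value(initial_face) > preset_pixel_value:
        return initial_face
    return None  # caller deletes the corresponding original image
```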
  • before step S31, the method may further include scaling the effective image so that the size of the facial-feature area matches the size of the preset selection frame; in step S31, the preset selection frame can then intercept an initial face image of appropriate size, which helps improve the accuracy of face recognition model training based on the acquired target face images.
  • in this embodiment, the image crawler tool downloads a large number of original images from the network, so data collection is fast; the face recognition algorithm identifies the original images to obtain effective images containing facial features, preventing original images without facial features from being used as effective images, ensuring the collected effective images can be applied to face recognition model training, and improving the effectiveness and accuracy of that training; the preset selection frame then obtains the target face image from the effective image, so that applying the acquired target face images to face recognition model training can effectively improve the accuracy of the face recognition model.
  • FIG. 6 shows a face image data collecting device corresponding to the face image data collecting method shown in the first embodiment.
  • the face image data collecting device includes an original image crawling module 10, an effective image recognition module 20, and an effective image capturing module 30.
  • the implementation functions of the original image crawling module 10, the effective image recognition module 20, and the effective image capture module 30 are in one-to-one correspondence with the corresponding steps in the first embodiment. To avoid redundancy, the present embodiment will not be described in detail.
  • the original image crawling module 10 is configured to use the image crawler tool to crawl the original image from the network.
  • the effective image recognition module 20 is configured to recognize the original image by using a face recognition algorithm, and obtain an effective image including the face feature.
  • the effective image intercepting module 30 is configured to capture a target facial image from the effective image by using a preset selection frame.
  • the original image crawling module 10 includes a webpage address crawling unit 11, a webpage address storage unit 12, and a picture downloading unit 13.
  • the webpage address crawling unit 11 is configured to use a web crawler to crawl a webpage address of the original image from the network.
  • the webpage address storage unit 12 is configured to store the webpage address in the message queue to be downloaded.
  • the image downloading unit 13 is configured to use the image downloading tool to crawl the original image from the webpage corresponding to the webpage address in the message queue to be downloaded.
  • Further, the effective image recognition module 20 includes a facial feature determining unit 211, a facial feature completeness determining unit 212, and a first image acquiring unit 213.
  • The facial feature determining unit 211 is configured to recognize the original image using a face recognition algorithm, and determine whether facial features are present in the original image.
  • The facial feature completeness determining unit 212 is configured to obtain the facial features of the original image, and determine whether the completeness of the facial features present in the original image reaches a preset completeness.
  • The first image acquiring unit 213 is configured to take the original image as an effective image containing facial features when the facial feature completeness reaches the preset completeness.
  • the effective image recognition module 20 includes a face region recognition unit 221, an image ratio determination unit 222, and a second image acquisition unit 223.
  • the face region identifying unit 221 is configured to identify the original image by using a face recognition algorithm, and determine whether a face region exists in the original image.
  • the image ratio determining unit 222 is configured to calculate a face image ratio value when the original image has a face region, and determine whether the face image ratio value is greater than a preset ratio value.
  • the second image obtaining unit 223 is configured to use the original image as the effective image including the facial features when the face image ratio is greater than the preset ratio.
  • the effective image capture module 30 includes an initial image capture unit 31, an image pixel acquisition unit 32, an image pixel determination unit 33, and a target image acquisition unit 34.
  • the initial image capturing unit 31 is configured to capture an initial face image including a facial feature from the effective image by using a preset selection frame.
  • the image pixel acquiring unit 32 is configured to acquire an actual pixel value of the initial face image.
  • the image pixel determining unit 33 is configured to determine whether the actual pixel value is greater than a preset pixel value.
  • the target image acquiring unit 34 is configured to use the initial face image as the target face image when the actual pixel value is greater than the preset pixel value.
  • In the face image data collecting device provided by this embodiment, the original image crawling module 10 uses an image crawler tool to crawl original images from the network, automatically capturing them according to certain rules without needing a camera to capture images or video streams containing faces, which improves image acquisition efficiency and reduces cost.
  • The effective image recognition module 20 uses a face recognition algorithm to recognize the original images and obtain effective images containing facial features; the algorithm can automatically detect and recognize faces in an image and then confirm the facial features of the detected faces.
  • The effective image capture module 30 captures target face images from the effective images using a preset selection box, so that face images can be captured clearly and completely.
  • This embodiment provides a computer readable storage medium storing computer readable instructions. When the computer readable instructions are executed by a processor, the face image data collecting method of Embodiment 1 is implemented; to avoid repetition, it is not described here again.
  • Alternatively, when the computer readable instructions are executed by the processor, the functions of the modules/units of the face image data collecting device of Embodiment 2 are implemented; to avoid repetition, details are not described here again.
  • Fig. 7 is a schematic diagram of a terminal device in this embodiment.
  • As shown in FIG. 7, the terminal device 70 includes a processor 71, a memory 72, and computer readable instructions 73 stored in the memory 72 and operable on the processor 71.
  • When executing the computer readable instructions 73, the processor 71 implements the steps of the face image data collecting method of Embodiment 1, such as steps S10, S20, and S30 shown in FIG. 1.
  • Alternatively, when executing the computer readable instructions 73, the processor 71 implements the functions of the modules/units of the face image data collecting device of Embodiment 2, such as the functions of the original image crawling module 10, the effective image recognition module 20, and the effective image capture module 30 shown in FIG. 6.
  • Exemplarily, the computer readable instructions 73 may be divided into one or more modules/units, which are stored in the memory 72 and executed by the processor 71 to complete the present application.
  • The one or more modules/units may be instruction segments of a series of computer readable instructions 73 capable of performing particular functions, the instruction segments describing the execution of the computer readable instructions 73 in the terminal device 70.
  • For example, the computer readable instructions 73 may be divided into the original image crawling module 10, the effective image recognition module 20, and the effective image capture module 30 shown in FIG. 6.
  • the original image crawling module 10 is configured to use the image crawler tool to crawl the original image from the network.
  • the effective image recognition module 20 is configured to recognize the original image by using a face recognition algorithm, and obtain an effective image including the face feature.
  • the effective image intercepting module 30 is configured to capture a target facial image from the effective image by using a preset selection frame.
  • Further, the original image crawling module 10 includes a webpage address crawling unit 11, a webpage address storage unit 12, and an image downloading unit 13.
  • the webpage address crawling unit 11 is configured to use a web crawler to crawl a webpage address of the original image from the network.
  • the webpage address storage unit 12 is configured to store the webpage address in the message queue to be downloaded.
  • the image downloading unit 13 is configured to use the image downloading tool to crawl the original image from the webpage corresponding to the webpage address in the message queue to be downloaded.
  • Further, the effective image recognition module 20 includes a facial feature determining unit 211, a facial feature completeness determining unit 212, and a first image acquiring unit 213.
  • The facial feature determining unit 211 is configured to recognize the original image using a face recognition algorithm, and determine whether facial features are present in the original image.
  • The facial feature completeness determining unit 212 is configured to obtain the facial features of the original image, and determine whether the completeness of the facial features present in the original image reaches a preset completeness.
  • The first image acquiring unit 213 is configured to take the original image as an effective image containing facial features when the facial feature completeness reaches the preset completeness.
  • the effective image recognition module 20 includes a face region recognition unit 221, an image ratio determination unit 222, and a second image acquisition unit 223.
  • the face region identifying unit 221 is configured to identify the original image by using a face recognition algorithm, and determine whether a face region exists in the original image.
  • the image ratio determining unit 222 is configured to calculate a face image ratio value when the original image has a face region, and determine whether the face image ratio value is greater than a preset ratio value.
  • the second image obtaining unit 223 is configured to use the original image as the effective image including the facial features when the face image ratio is greater than the preset ratio.
  • the effective image capture module 30 includes an initial image capture unit 31, an image pixel acquisition unit 32, an image pixel determination unit 33, and a target image acquisition unit 34.
  • the initial image capturing unit 31 is configured to capture an initial face image including a facial feature from the effective image by using a preset selection frame.
  • the image pixel acquiring unit 32 is configured to acquire an actual pixel value of the initial face image.
  • the image pixel determining unit 33 is configured to determine whether the actual pixel value is greater than a preset pixel value.
  • the target image acquiring unit 34 is configured to use the initial face image as the target face image when the actual pixel value is greater than the preset pixel value.
  • The terminal device 70 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server.
  • The terminal device may include, but is not limited to, the processor 71 and the memory 72. Those skilled in the art will understand that FIG. 6 is merely an example of the terminal device 70 and does not constitute a limitation on it; the terminal device may include more or fewer components than illustrated, combine certain components, or have different components.
  • For example, the terminal device may further include input/output devices, network access devices, buses, and the like.
  • The processor 71 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The memory 72 may be an internal storage unit of the terminal device 70, such as a hard disk or memory of the terminal device 70.
  • The memory 72 may also be an external storage device of the terminal device 70, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card (Flash Card) provided on the terminal device 70.
  • Further, the memory 72 may include both an internal storage unit and an external storage device of the terminal device 70.
  • The memory 72 is used to store the computer readable instructions 73 and the other programs and data required by the terminal device.
  • The memory 72 may also be used to temporarily store data that has been or will be output.
  • Each functional unit and module in the foregoing system may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit; the integrated units may be implemented in the form of hardware or in the form of software functional units.
  • the specific names of the respective functional units and modules are only for the purpose of facilitating mutual differentiation, and are not intended to limit the scope of protection of the present application.
  • The disclosed device/terminal device and method may be implemented in other manners.
  • For example, the device/terminal device embodiments described above are merely illustrative: the division of the modules or units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated modules/units if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium.
  • the present application implements all or part of the processes in the foregoing embodiments, and may also be implemented by computer readable instructions, which may be stored in a computer readable storage medium.
  • the computer readable instructions when executed by a processor, may implement the steps of the various method embodiments described above.
  • The computer readable instructions include computer readable instruction code, which may be in source code form, object code form, an executable file, some intermediate form, or the like.
  • The computer readable medium may include any entity or device capable of carrying the computer readable instruction code, a recording medium, a USB flash drive, a removable hard drive, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, and so on.


Abstract

The present application discloses a face image data collection method, apparatus, terminal device, and storage medium. The face image data collection method includes: crawling an original image from the network using an image crawler tool; recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features; and capturing a target face image from the effective image using a preset selection box. This face image data collection method can rapidly collect a large number of face images.

Description

Face image data collection method, apparatus, terminal device and storage medium
This patent application is based on, and claims priority from, Chinese invention patent application No. 201710706509.1, filed on August 17, 2017 and entitled "Face image data collection method, apparatus, terminal device and storage medium".
Technical Field
The present application relates to the technical field of image processing, and in particular to a face image data collection method, apparatus, terminal device, and storage medium.
Background
Face recognition technology is a biometric technology that performs identity recognition based on facial feature information. Specifically, it uses a camera to capture images or video streams containing faces, automatically detects the faces in the images or video streams using a face recognition model, and then performs facial recognition on the detected faces. With the development and popularization of face recognition technology, a large amount of face image data needs to be collected to train face recognition models, so as to improve their recognition accuracy. The current face image data collection process consumes substantial manpower and material resources; its cost is high and its efficiency is low.
Summary
The present application provides a face image data collection method, apparatus, terminal device, and storage medium, to solve the problem that the current face image data collection process is inefficient.
In a first aspect, the present application provides a face image data collection method, including:
crawling an original image from the network using an image crawler tool;
recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features;
capturing a target face image from the effective image using a preset selection box.
In a second aspect, the present application provides a face image data collection apparatus, including:
an original image crawling module, configured to crawl an original image from the network using an image crawler tool;
an effective image recognition module, configured to recognize the original image using a face recognition algorithm to obtain an effective image containing facial features;
an effective image capture module, configured to capture a target face image from the effective image using a preset selection box.
In a third aspect, the present application provides a terminal device, including a memory, a processor, and computer readable instructions stored in the memory and operable on the processor, the processor implementing the following steps when executing the computer readable instructions:
crawling an original image from the network using an image crawler tool;
recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features;
capturing a target face image from the effective image using a preset selection box.
In a fourth aspect, the present application provides a computer readable storage medium storing computer readable instructions, the computer readable instructions implementing the following steps when executed by a processor:
crawling an original image from the network using an image crawler tool;
recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features;
capturing a target face image from the effective image using a preset selection box.
Compared with the prior art, the present application has the following advantages. In the face image data collection method, apparatus, terminal device, and storage medium provided by the present application, using an image crawler tool to crawl original images from the network allows massive numbers of original images to be captured automatically according to certain rules, so data collection is fast. A face recognition algorithm then recognizes the original images to obtain effective images containing facial features, so that original images without facial features are not taken as effective images, ensuring that the collected effective images can be applied to face recognition model training and improving the effectiveness and accuracy of that training. A preset selection box then captures the target face images from the effective images, so that when the collected target face images are applied to face recognition model training, the accuracy of the face recognition model can be effectively improved.
Brief Description of the Drawings
To explain the technical solutions of the present application more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of the face image data collection method in Embodiment 1.
FIG. 2 is a detailed flowchart of step S10 in FIG. 1.
FIG. 3 is a detailed flowchart of step S20 in FIG. 1.
FIG. 4 is another detailed flowchart of step S20 in FIG. 1.
FIG. 5 is a detailed flowchart of step S30 in FIG. 1.
FIG. 6 is a schematic block diagram of the face image data collection apparatus in Embodiment 2.
FIG. 7 is a schematic block diagram of the terminal device in Embodiment 4.
Detailed Description
In the following description, specific details such as particular system structures and techniques are set forth for illustration rather than limitation, in order to provide a thorough understanding of the present application. However, it will be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present application.
Embodiment 1
FIG. 1 shows the face image data collection method of this embodiment. This method can quickly collect a large amount of face image data from the network, so that face recognition model training can be performed on the collected face image data. As shown in FIG. 1, the face image data collection method includes the following steps:
S10: crawl an original image from the network using an image crawler tool.
An image crawler tool is a program that can automatically crawl the web addresses of pages containing images and download the images based on the crawled addresses. This tool crawls only images, and no other data, from the network; it is therefore well targeted and helps improve image collection efficiency. An original image is an image downloaded from the network by the image crawler tool. In this embodiment, the image crawler tool can download massive numbers of original images from social networking sites, search engines, or other websites; the data volume is large and the acquisition process is simple and convenient.
Specifically, the image crawler tool includes a web crawler and an image download tool, which may be integrated into one whole or provided separately. A web crawler is a program or script that automatically captures Internet information according to certain rules. An image download tool is a program or script that automatically downloads images from the Internet based on an input web address. In this embodiment, the image crawler tool may be a distributed image crawler tool, such as a Python image crawler tool, which can crawl original images in parallel and improve crawling efficiency. The Python image crawler tool integrates a web crawler and an image download tool.
In a specific implementation, as shown in FIG. 2, step S10 specifically includes the following steps:
S11: crawl the web address of the original image from the network using a web crawler.
A web address (Uniform/Universal Resource Locator, URL) is the address of a standard resource on the Internet; here it is the address of the web page where the original image is located. In this embodiment, the web crawler automatically crawls web addresses containing original images from the Internet according to a crawl task set by the user, without manual searching, which helps improve data collection efficiency.
Further, crawling the web address of the original image using a web crawler specifically includes the following steps:
First, configure a crawl task in the web crawler; the task contains an original web address, pagination rules, and keywords. The original web address is a user-defined address at which execution of the crawl task starts. The pagination rules are user-defined rules for paginating web pages; they can be set according to the actual data source, in either a fixed or a non-fixed format. The keywords are the terms the web crawler searches for while crawling web addresses. A keyword may be a term obtained by clustering historical data, so that the probability of obtaining effective images from a search based on that keyword is relatively high; for example, with the keyword "selfie", the probability of obtaining effective images containing facial features is relatively high.
Second, have the web crawler execute the crawl task, capturing web addresses containing original images from the original web address onward, based on the pagination rules and keywords. In this embodiment, a preset search strategy can be used to continuously crawl new web addresses from the current page into a to-download message queue, until a preset stop condition is met and the crawl task stops. The preset search strategy includes, but is not limited to, the breadth-first or depth-first search strategy used in this embodiment.
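The breadth-first crawl described above (start from the original web address, harvest newly discovered addresses from each page, and stop when a preset condition is met) can be sketched as follows. The `fetch_links` callback is a hypothetical stand-in for the page parsing a real crawler performs; only the queueing logic is illustrated.

```python
from collections import deque

def bfs_crawl(start_url, fetch_links, max_pages=100):
    """Breadth-first crawl: visit pages level by level, collecting
    newly discovered URLs into a FIFO queue until the stop
    condition (here, a page budget) is met."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for link in fetch_links(url):       # links found on the page
            if link not in seen:
                seen.add(link)
                queue.append(link)          # enqueue for a later level
    return order

# Toy link graph standing in for real pages.
links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bfs_crawl("a", lambda u: links.get(u, [])))  # ['a', 'b', 'c', 'd']
```

Swapping the `deque` for a stack would give the depth-first variant the embodiment also allows.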
S12: store the web addresses in a to-download message queue.
Specifically, the web address of each original image crawled in step S11 is stored in the to-download message queue in the order in which it was crawled, so that in step S13 image downloading can proceed on the basis of the web addresses in the queue. The to-download message queue processes web addresses first in, first out, so that crawling web addresses and downloading original images from those addresses are handled asynchronously, which helps improve the efficiency of acquiring original images.
S13: crawl the original images from the web pages corresponding to the web addresses in the to-download message queue using an image download tool.
An image download tool downloads images in bulk: given an input web address, it automatically downloads all the images on the corresponding page. It may be integrated into the image crawler tool, as when an image download tool is integrated into a Python image crawler; or it may be a stand-alone image download tool, such as NeoDownloader, which can quickly download images in bulk.
In this embodiment, multiple web addresses containing original images are stored in the to-download message queue, and the image download tool takes the web addresses from the queue one by one and downloads the corresponding original images. Specifically, the image download tool takes a web address from the head of the queue, downloads the images at that web address, stores the downloaded original images in a database, and then removes the corresponding web address from the queue; these steps are repeated until no web address remains in the queue, so that the original images corresponding to all the web addresses crawled by the image crawler tool are obtained.
In this embodiment, the web addresses of the original images crawled by the web crawler are stored in the to-download message queue, and the image download tool then downloads the original images based on the web addresses taken from the queue, so that web address crawling and original image downloading are processed asynchronously, which helps improve the efficiency of acquiring original images.
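A minimal sketch of the asynchronous hand-off between address crawling and image downloading in steps S12 and S13, using Python's standard `queue.Queue` as the to-download message queue. The `download` callback is a hypothetical stand-in for the real image download tool.

```python
from queue import Queue

def producer(urls, q):
    # The crawler side: push discovered web addresses into the
    # to-download queue in the order they were found.
    for u in urls:
        q.put(u)

def consumer(q, download):
    # The downloader side: take addresses first in, first out,
    # fetch the images, then discard the finished address.
    results = []
    while not q.empty():
        url = q.get()
        results.append(download(url))
        q.task_done()
    return results

q = Queue()
producer(["page1", "page2", "page3"], q)
images = consumer(q, lambda u: f"image-from-{u}")
print(images)  # ['image-from-page1', 'image-from-page2', 'image-from-page3']
```

In a real deployment the producer and consumer would run in separate threads or processes, which is exactly what `queue.Queue` is designed for; the single-threaded loop here only demonstrates the FIFO decoupling.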
S20: recognize the original image using a face recognition algorithm to obtain an effective image containing facial features.
The original images downloaded from the network by the web crawler may or may not contain facial features. If face recognition model training were performed directly on the collected original images, the images without facial features would affect the accuracy and efficiency of the training. Therefore, a face recognition algorithm is used to recognize the original images and extract the effective images containing facial features, so that performing face recognition model training on the effective images improves the accuracy and efficiency of the training. A face recognition algorithm is an algorithm for recognizing facial features in an image. In this embodiment, a face recognition program storing a face recognition algorithm is preset; when the program is executed by the processor, the algorithm performs face recognition on the original images to obtain the effective images containing facial features.
Specifically, the original images downloaded from the network by the image crawler tool are cached in a database, and their storage addresses in the database are placed in a to-recognize message queue. The face recognition program is executed, the corresponding original images are fetched in order based on the storage addresses in the queue, and the face recognition algorithm recognizes each original image to determine whether it contains facial features. If it does, the original image is deemed an effective image and saved; if it does not, the original image is deemed not effective and the cached copy is deleted from the database to save storage space.
In this embodiment, the face recognition algorithm may be a geometric-feature-based face recognition algorithm, an Eigenface-based face recognition algorithm, an elastic-model-based face recognition algorithm, a neural-network-based face recognition algorithm, and so on. The geometric-feature-based algorithm performs face recognition by extracting the geometric features of organs such as the eyes, ears, mouth, nose, and brows as classification features. The Eigenface-based algorithm constructs a principal-component subspace from a set of face training images; at recognition time the original image is projected onto that subspace to obtain a set of projection coefficients, which are compared against known face images to recognize facial features. Because the principal components have the shape of a face, they are called eigenfaces. The elastic-model-based algorithm describes an object as a sparse graph whose vertices represent a multi-scale description of local energy and whose edges represent topological connections labelled with geometric distances, and then applies elastic graph matching to find the closest known graph. The neural-network-based algorithm is a nonlinear dynamical system: multiple principal components are extracted, mapped to a multidimensional space by an autoassociative neural network, and judged by a multilayer perceptron to recognize the face; this method has good self-organization and adaptability.
In one specific implementation, as shown in FIG. 3, step S20 specifically includes the following steps:
S211: recognize the original image using a face recognition algorithm, and determine whether the five facial features are present in the original image.
The five facial features are a kind of facial feature, comprising the features of the five organs: eyes, ears, mouth, nose, and brows. In this embodiment, an Eigenface-based face recognition algorithm can be used to determine whether the five facial features are present in the original image, which specifically includes the following steps. First, an Active Appearance Model (AAM) detects the images of the five organs, and their feature vectors, in the original image. Second, Principal Component Analysis (PCA) extracts the feature vectors of the organs while reducing the data dimension; extracting the main feature vectors through PCA dimensionality reduction lowers the computational burden. Then, the K-means method classifies the PCA-processed feature vectors, achieving simple and fast classification. Finally, a Support Vector Machine (SVM) is trained on the data of these K classes to obtain a classification model, based on which it can be recognized whether the original image contains the five facial features.
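A toy, NumPy-only illustration of the Eigenface-style pipeline sketched above: PCA dimensionality reduction via SVD, followed by a K-means-style grouping. The random data is an assumption standing in for the organ feature vectors an AAM detector would produce, and the final SVM training step is only noted in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data: 40 flattened feature patches from two well-separated
# "classes" (real input would come from AAM organ detection).
X = np.vstack([rng.normal(0, 1, (20, 64)), rng.normal(5, 1, (20, 64))])

# PCA via SVD: project onto the leading eigen-directions
# ("eigenfaces") to cut the dimension and the compute cost.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
feats = Xc @ Vt[:8].T                     # keep 8 principal components

# A minimal 2-means step groups the reduced vectors into classes;
# a real pipeline would then train an SVM on these cluster labels.
c0, c1 = feats[0], feats[-1]              # seed one centroid per class
for _ in range(10):
    d0 = np.linalg.norm(feats - c0, axis=1)
    d1 = np.linalg.norm(feats - c1, axis=1)
    labels = (d1 < d0).astype(int)
    c0 = feats[labels == 0].mean(axis=0)
    c1 = feats[labels == 1].mean(axis=0)
print(labels[:20].sum(), labels[20:].sum())  # clusters align with the two classes
```

On this synthetic data the two clusters separate cleanly; real organ feature vectors would of course be noisier, which is why the embodiment follows the clustering with an SVM classification model.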
In this embodiment, the face recognition algorithm recognizes the original image; if the five facial features are recognized in the original image, step S212 is executed; if not, the original image is not an effective image containing facial features and is deleted to save database storage space.
S212: if the five facial features are present in the original image, obtain the facial feature completeness of the original image, and determine whether the facial feature completeness reaches a preset completeness.
The facial feature completeness is the ratio of the facial features recognized in the original image to the complete facial features: completeness = Σ (organ weight × organ completeness), where the organs are the eyes, ears, mouth, nose, and brows. The organ completeness is the ratio of the organ features recognized in the original image to the complete organ features. The organ weight is a user-predefined weight constant, which can be set according to the distance of the organ from the centre of the face. In this embodiment, the nose is closest to the centre of the face, so its weight is the largest; correspondingly, the ears are farthest from the centre, so their weight is the smallest. If an organ is fully visible in the original image, its organ completeness is 100%; if only half of it is visible, its organ completeness is 50%. The preset completeness is a user-preset reference value for evaluating whether the facial features are complete; it is user-defined and may be set to 80% or another value.
S213: if the facial feature completeness reaches the preset completeness, take the original image as an effective image containing facial features.
Understandably, if the facial feature completeness of the original image reaches the preset completeness, the facial features in the original image are considered complete; that is, the original image contains relatively complete facial features and can be saved as an effective image. Conversely, if the completeness does not reach the preset value, the facial features are considered incomplete, the original image cannot serve as training data for a face recognition model, and it is deleted to save database storage space. In this embodiment, if a complete face is present in the original image, its completeness is 100%, which reaches the preset completeness of 80%, and the original image is saved as an effective image; if only half a face is present, its completeness is 50%, which falls short of the preset 80%, so the original image is deemed not effective and deleted to save database storage space.
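The weighted completeness rule above (completeness = Σ organ weight × organ completeness, against a preset threshold such as 80%) can be sketched as follows. The specific weight values are illustrative assumptions, chosen only to respect the stated rule that the nose, nearest the face centre, weighs most and the ears least.

```python
# Organ weights are user-defined constants: the nearer the organ is
# to the face centre, the larger the weight (illustrative values).
WEIGHTS = {"nose": 0.30, "mouth": 0.25, "eyes": 0.25, "brows": 0.12, "ears": 0.08}

def facial_completeness(organ_completeness):
    """completeness = sum(weight * per-organ completeness), where each
    per-organ value is the detected fraction of that organ (0..1)."""
    return sum(WEIGHTS[o] * organ_completeness.get(o, 0.0) for o in WEIGHTS)

def meets_completeness(organ_completeness, threshold=0.8):
    # Keep the image only if completeness reaches the preset value.
    return facial_completeness(organ_completeness) >= threshold

full_face = {o: 1.0 for o in WEIGHTS}   # every organ fully visible
half_face = {o: 0.5 for o in WEIGHTS}   # every organ half visible
print(meets_completeness(full_face))  # True  (completeness 100% >= 80%)
print(meets_completeness(half_face))  # False (completeness 50% < 80%)
```

This mirrors the worked example in the text: a complete face scores 100% and is kept, while a half face scores 50% and is discarded.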
In another specific implementation, as shown in FIG. 4, step S20 specifically includes the following steps:
S221: recognize the original image using a face recognition algorithm, and determine whether a face region is present in the original image.
The face region is the facial area above the neck; it includes not only the five organs (eyes, ears, mouth, nose, and brows) but also features such as skin colour and facial expression. In this embodiment, the geometric-feature-based, Eigenface-based, elastic-model-based, or neural-network-based face recognition algorithms described above can likewise be used to recognize the face region.
Specifically, a face recognition algorithm based on a BP neural network (Back Propagation Neural Network) is used to recognize the original image. A BP neural network is a feed-forward network generally comprising an input layer, hidden layers, and an output layer. There may be one, two, or more hidden layers, so as to analyse the interactions among factors. Each layer consists of several neurons; every neuron in one layer is connected by a weight to every neuron in the adjacent layer, the weight reflecting the connection strength between the two neurons, and the computation of the whole network proceeds one way from the input layer through the hidden layers to the output layer. A BP neural network is essentially an input-to-output mapping, learned from a large number of input and output pairs. Recognizing the original image with a BP-network-based algorithm specifically includes the following steps. (1) Pre-process the original image by image compression, image sampling, and input vector standardization to obtain image features. Image compression uses an interpolation algorithm such as nearest-neighbour, bilinear, or bicubic interpolation to compress the original image, so that the large amount of redundant information in it does not make the BP network structure overly complex. Image sampling concatenates the compressed two-dimensional image matrix row by row into a one-dimensional column vector for input to the BP network. Input vector standardization normalizes that column vector, so that large values do not harm computational efficiency and the convergence rate. (2) Feed the obtained image features into the input layer of the BP network; after processing by the hidden layers, the output layer outputs the probability that the original image contains a face region. (3) Compare that probability with a preset probability: if the probability of the original image containing a face region is greater than the preset probability, a face region is considered present in the original image; otherwise, no face region is considered present. The preset probability is a user-defined probability for evaluating whether a face region is present in the original image.
In this embodiment, the face recognition algorithm recognizes the original image; if a face region is recognized, step S222 is executed; if no facial features are recognized, the original image is not an effective image containing facial features and is deleted to save database storage space.
S222: if a face region is present in the original image, calculate the face image ratio, and determine whether the face image ratio is greater than a preset ratio.
The face image ratio is the ratio of the size of the image corresponding to the face region to the size of the original image. In this embodiment, the face region can be delimited by a rectangular box, so the size of the image corresponding to the face region is the area of the box; correspondingly, the size of the original image is the area of the original image. That is, the face image ratio is the ratio of the area of the face region to the area of the original image. The preset ratio is a user-preset reference value for evaluating whether the original image is effective, and can be user-defined.
S223: if the face image ratio is greater than the preset ratio, take the original image as an effective image containing facial features.
Understandably, if the face image ratio of the original image is greater than the preset ratio, the original image is deemed an effective image containing facial features. If it is not, the area of the face region in the original image is too small; using such an original image as training data for a subsequent face recognition model might affect the accuracy and efficiency of the training, so an original image whose face image ratio is too small is not taken as an effective image and is deleted to save database storage space.
In this embodiment, steps S211 to S213 (comparing the facial feature completeness with the preset completeness) or steps S221 to S223 (comparing the face image ratio with the preset ratio) can be used to determine whether the original image is an effective image containing facial features; either judgment improves, to a certain extent, the accuracy of subsequent face recognition model training on the obtained effective images. Understandably, steps S211 to S213 and steps S221 to S223 can also be combined, completing both comparisons in turn and deeming the original image effective only when both conditions are satisfied, so as to further improve the accuracy of subsequent face recognition model training.
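A minimal sketch of the S222 to S223 check: the face image ratio is the face bounding-box area divided by the whole image area, and the image is kept only when the ratio exceeds a preset value. The 0.1 threshold and the `(x, y, w, h)` box format are assumptions for illustration, not values fixed by the application.

```python
def face_ratio(face_box, image_size):
    """Ratio of the face bounding-box area to the whole image area.
    face_box: (x, y, w, h); image_size: (width, height)."""
    _, _, w, h = face_box
    img_w, img_h = image_size
    return (w * h) / (img_w * img_h)

def meets_ratio(face_box, image_size, preset_ratio=0.1):
    # Discard images whose face region is too small to help training.
    return face_ratio(face_box, image_size) > preset_ratio

print(meets_ratio((100, 80, 200, 240), (640, 480)))  # 48000/307200 ≈ 0.156 → True
print(meets_ratio((10, 10, 50, 60), (640, 480)))     # 3000/307200 ≈ 0.010 → False
```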
S30: capture the target face image from the effective image using a preset selection box.
The preset selection box is a user-defined box, which may be rectangular, for cropping an image out of the effective image. The effective image obtained in step S20 includes both the region where the facial features are located and the non-face region outside the facial features, while face recognition model training generally attends only to the facial features in the effective image and not to the non-face region. Training a face recognition model directly on an effective image containing the non-face region might affect the accuracy of the model training; therefore, the preset selection box is used to crop the image containing the facial features out of the effective image, that is, to crop the portion of the effective image where the facial features are located, obtaining and saving the target face image, so as to improve the accuracy of subsequent face recognition model training on the target face images.
In a specific implementation, as shown in FIG. 5, step S30 specifically includes the following steps:
S31: capture an initial face image containing the facial features from the effective image using the preset selection box.
The initial face image is the image obtained by cropping the effective image. The preset selection box selects the portion of the effective image where the facial features are located, and a screenshot operation is executed to crop the initial face image containing the facial features. In this embodiment, the centre position of the facial features in the effective image is captured and made to coincide with the centre of the preset selection box, so as to determine the position of the facial features to be cropped; the screenshot operation is then executed to obtain the initial face image.
S32: obtain the actual pixel value of the initial face image.
Understandably, if the initial face image obtained in step S31 were taken directly as training data for a face recognition model, a low-pixel initial face image might affect the accuracy and efficiency of the model training; therefore, the actual pixel value of the initial face image must be determined during image collection, so as to decide whether the initial face image can serve as training data for face recognition model training. In this embodiment, MATLAB or OpenCV can be used to compute the RGB values of the initial face image to obtain its actual pixel value.
S33: determine whether the actual pixel value is greater than a preset pixel value.
The preset pixel value is the pixel value required of a face recognition model training image; it is a pixel reference value customized by the user according to need. The larger the preset pixel value, the fewer the collected images that satisfy it and the higher the accuracy and efficiency of the face recognition model training; conversely, the smaller the preset pixel value, the more the collected images that satisfy it and the lower the accuracy and efficiency of the training. The preset pixel value therefore needs to be set moderately.
S34: if the actual pixel value is greater than the preset pixel value, take the initial face image as the target face image.
Understandably, if the actual pixel value of the initial face image is greater than the preset pixel value, the initial face image is deemed to reach the pixel value required for face recognition model training and is output as the target face image. Conversely, if the actual pixel value is not greater than the preset value, the actual pixel value of the initial face image is deemed too low; training a face recognition model on such an initial face image would affect the accuracy and effect of the training, so the original image corresponding to that initial face image is deleted to save database storage space.
Further, before step S31, the method also includes: scaling the effective image so that the size of the region where its facial features are located matches the size of the preset selection box, so that in step S31 the preset selection box can crop an initial face image of suitable size, which helps improve the accuracy of subsequent face recognition model training on the obtained target face images.
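The S31 to S34 cropping and pixel-value checks can be sketched as follows. The box centring mirrors the description of aligning the selection box centre with the detected face centre; the 128×128 preset pixel value and the coordinate formats are illustrative assumptions.

```python
def crop_box(face_center, box_size, image_size):
    """Centre a fixed-size selection box on the detected face centre,
    clamped to the image bounds; returns (left, top, right, bottom)."""
    cx, cy = face_center
    bw, bh = box_size
    img_w, img_h = image_size
    left = min(max(cx - bw // 2, 0), img_w - bw)
    top = min(max(cy - bh // 2, 0), img_h - bh)
    return (left, top, left + bw, top + bh)

def meets_pixel_threshold(box, preset_pixels=128 * 128):
    # Keep the crop only if its pixel count exceeds the preset value.
    left, top, right, bottom = box
    return (right - left) * (bottom - top) > preset_pixels

box = crop_box((320, 240), (160, 160), (640, 480))
print(box)                         # (240, 160, 400, 320)
print(meets_pixel_threshold(box))  # 160*160 = 25600 > 16384 → True
```

The returned 4-tuple matches the box convention image libraries such as Pillow use for cropping, so the sketch plugs directly into a real crop call.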
In the face image data collection method provided by this embodiment, the image crawler tool can download massive numbers of original images from the network, so data collection is fast. The face recognition algorithm then recognizes the original images to obtain effective images containing facial features, so that original images without facial features are not taken as effective images, ensuring that the collected effective images can be applied to face recognition model training and improving the effectiveness and accuracy of that training. The preset selection box then obtains the target face images from the effective images, so that when the collected target face images are applied to face recognition model training, the accuracy of the face recognition model can be effectively improved.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the present application.
Embodiment 2
Corresponding to the face image data collection method of Embodiment 1, FIG. 6 shows a face image data collection apparatus corresponding one-to-one to that method. As shown in FIG. 6, the face image data collection apparatus includes an original image crawling module 10, an effective image recognition module 20, and an effective image capture module 30. The functions implemented by these modules correspond one-to-one to the corresponding steps of Embodiment 1; to avoid redundancy, this embodiment does not describe them in detail.
The original image crawling module 10 is configured to crawl an original image from the network using an image crawler tool.
The effective image recognition module 20 is configured to recognize the original image using a face recognition algorithm to obtain an effective image containing facial features.
The effective image capture module 30 is configured to capture a target face image from the effective image using a preset selection box.
Further, the original image crawling module 10 includes a web address crawling unit 11, a web address storage unit 12, and an image download unit 13.
The web address crawling unit 11 is configured to crawl the web address of the original image from the network using a web crawler.
The web address storage unit 12 is configured to store the web address in a to-download message queue.
The image download unit 13 is configured to crawl the original image from the web page corresponding to the web address in the to-download message queue using an image download tool.
Further, the effective image recognition module 20 includes a facial feature determining unit 211, a facial feature completeness determining unit 212, and a first image acquiring unit 213.
The facial feature determining unit 211 is configured to recognize the original image using a face recognition algorithm, and determine whether the five facial features are present in the original image.
The facial feature completeness determining unit 212 is configured to obtain the facial features of the original image, and determine whether the completeness of the facial features present in the original image reaches a preset completeness.
The first image acquiring unit 213 is configured to take the original image as an effective image containing facial features when the facial feature completeness reaches the preset completeness.
Further, the effective image recognition module 20 includes a face region recognition unit 221, an image ratio determining unit 222, and a second image acquiring unit 223.
The face region recognition unit 221 is configured to recognize the original image using a face recognition algorithm, and determine whether a face region is present in the original image.
The image ratio determining unit 222 is configured to calculate the face image ratio when a face region is present in the original image, and determine whether the face image ratio is greater than a preset ratio.
The second image acquiring unit 223 is configured to take the original image as an effective image containing facial features when the face image ratio is greater than the preset ratio.
Further, the effective image capture module 30 includes an initial image capture unit 31, an image pixel acquiring unit 32, an image pixel determining unit 33, and a target image acquiring unit 34.
The initial image capture unit 31 is configured to capture an initial face image containing the facial features from the effective image using the preset selection box.
The image pixel acquiring unit 32 is configured to obtain the actual pixel value of the initial face image.
The image pixel determining unit 33 is configured to determine whether the actual pixel value is greater than a preset pixel value.
The target image acquiring unit 34 is configured to take the initial face image as the target face image when the actual pixel value is greater than the preset pixel value.
In the face image data collection apparatus provided by this embodiment, the original image crawling module 10 crawls original images from the network using an image crawler tool, automatically capturing them according to certain rules without needing a camera to capture images or video streams containing faces, which improves image acquisition efficiency and reduces cost. The effective image recognition module 20 recognizes the original images using a face recognition algorithm to obtain effective images containing facial features; the algorithm can automatically detect and recognize faces in an image and then confirm the facial features of the detected faces. The effective image capture module 30 captures target face images from the effective images using a preset selection box, so that face images can be captured clearly and completely.
Embodiment 3
This embodiment provides a computer readable storage medium storing computer readable instructions. When the computer readable instructions are executed by a processor, the face image data collection method of Embodiment 1 is implemented; to avoid repetition, it is not described here again. Alternatively, when the computer readable instructions are executed by the processor, the functions of the modules/units of the face image data collection apparatus of Embodiment 2 are implemented; to avoid repetition, they are not described here again.
Embodiment 4
FIG. 7 is a schematic diagram of the terminal device in this embodiment. As shown in FIG. 7, the terminal device 70 includes a processor 71, a memory 72, and computer readable instructions 73 stored in the memory 72 and operable on the processor 71. When executing the computer readable instructions 73, the processor 71 implements the steps of the face image data collection method of Embodiment 1, such as steps S10, S20, and S30 shown in FIG. 1; or, when executing the computer readable instructions 73, the processor 71 implements the functions of the modules/units of the face image data collection apparatus of Embodiment 2, such as the functions of the original image crawling module 10, the effective image recognition module 20, and the effective image capture module 30 shown in FIG. 6.
Exemplarily, the computer readable instructions 73 may be divided into one or more modules/units, which are stored in the memory 72 and executed by the processor 71 to complete the present application. The one or more modules/units may be instruction segments of a series of computer readable instructions 73 capable of performing particular functions, the instruction segments describing the execution of the computer readable instructions 73 in the terminal device 70. For example, the computer readable instructions 73 may be divided into the original image crawling module 10, the effective image recognition module 20, and the effective image capture module 30 shown in FIG. 6.
The original image crawling module 10 is configured to crawl an original image from the network using an image crawler tool.
The effective image recognition module 20 is configured to recognize the original image using a face recognition algorithm to obtain an effective image containing facial features.
The effective image capture module 30 is configured to capture a target face image from the effective image using a preset selection box.
Further, the original image crawling module 10 includes a web address crawling unit 11, a web address storage unit 12, and an image download unit 13.
The web address crawling unit 11 is configured to crawl the web address of the original image from the network using a web crawler.
The web address storage unit 12 is configured to store the web address in a to-download message queue.
The image download unit 13 is configured to crawl the original image from the web page corresponding to the web address in the to-download message queue using an image download tool.
Further, the effective image recognition module 20 includes a facial feature determining unit 211, a facial feature completeness determining unit 212, and a first image acquiring unit 213.
The facial feature determining unit 211 is configured to recognize the original image using a face recognition algorithm, and determine whether the five facial features are present in the original image.
The facial feature completeness determining unit 212 is configured to obtain the facial features of the original image, and determine whether the completeness of the facial features present in the original image reaches a preset completeness.
The first image acquiring unit 213 is configured to take the original image as an effective image containing facial features when the facial feature completeness reaches the preset completeness.
Further, the effective image recognition module 20 includes a face region recognition unit 221, an image ratio determining unit 222, and a second image acquiring unit 223.
The face region recognition unit 221 is configured to recognize the original image using a face recognition algorithm, and determine whether a face region is present in the original image.
The image ratio determining unit 222 is configured to calculate the face image ratio when a face region is present in the original image, and determine whether the face image ratio is greater than a preset ratio.
The second image acquiring unit 223 is configured to take the original image as an effective image containing facial features when the face image ratio is greater than the preset ratio.
Further, the effective image capture module 30 includes an initial image capture unit 31, an image pixel acquiring unit 32, an image pixel determining unit 33, and a target image acquiring unit 34.
The initial image capture unit 31 is configured to capture an initial face image containing the facial features from the effective image using the preset selection box.
The image pixel acquiring unit 32 is configured to obtain the actual pixel value of the initial face image.
The image pixel determining unit 33 is configured to determine whether the actual pixel value is greater than a preset pixel value.
The target image acquiring unit 34 is configured to take the initial face image as the target face image when the actual pixel value is greater than the preset pixel value.
The terminal device 70 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 71 and the memory 72. Those skilled in the art will understand that FIG. 6 is merely an example of the terminal device 70 and does not constitute a limitation on it; the terminal device may include more or fewer components than illustrated, combine certain components, or have different components; for example, it may further include input/output devices, network access devices, buses, and the like.
The processor 71 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 72 may be an internal storage unit of the terminal device 70, such as its hard disk or memory. The memory 72 may also be an external storage device of the terminal device 70, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card (Flash Card) provided on the terminal device 70. Further, the memory 72 may include both an internal storage unit and an external storage device of the terminal device 70. The memory 72 is used to store the computer readable instructions 73 and the other programs and data required by the terminal device, and may also be used to temporarily store data that has been or will be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is used as an example; in practical application, the above functions may be allocated to different functional units and modules as needed, that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit; the integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only for convenience of distinguishing them from each other and are not intended to limit the scope of protection of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis; for a part of an embodiment that is not detailed or recorded, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative: the division of the modules or units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated units may be implemented in the form of hardware or in the form of software functional units.
If implemented in the form of software functional units and sold or used as independent products, the integrated modules/units may be stored in a computer readable storage medium. Based on this understanding, the present application implements all or part of the processes in the methods of the above embodiments, which may also be completed by computer readable instructions instructing the relevant hardware; the computer readable instructions may be stored in a computer readable storage medium, and when executed by a processor, can implement the steps of the above method embodiments. The computer readable instructions include computer readable instruction code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include any entity or device capable of carrying the computer readable instruction code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, and so on. It should be noted that the content contained in the computer readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of the technical features therein may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the scope of protection of the present application.

Claims (20)

  1. A face image data collection method, comprising:
    crawling an original image from the network using an image crawler tool;
    recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features;
    capturing a target face image from the effective image using a preset selection box.
  2. The face image data collection method according to claim 1, wherein the recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features comprises:
    recognizing the original image using a face recognition algorithm, and determining whether five facial features are present in the original image;
    if the five facial features are present in the original image, obtaining the facial feature completeness of the original image, and determining whether the facial feature completeness reaches a preset completeness;
    if the facial feature completeness reaches the preset completeness, taking the original image as the effective image containing facial features.
  3. The face image data collection method according to claim 1 or 2, wherein the recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features comprises:
    recognizing the original image using a face recognition algorithm, and determining whether a face region is present in the original image;
    if a face region is present in the original image, calculating a face image ratio, and determining whether the face image ratio is greater than a preset ratio;
    if the face image ratio is greater than the preset ratio, taking the original image as the effective image containing facial features.
  4. The face image data collection method according to claim 1, wherein the capturing a target face image from the effective image using a preset selection box comprises:
    capturing an initial face image containing the facial features from the effective image using a preset selection box;
    obtaining an actual pixel value of the initial face image;
    determining whether the actual pixel value is greater than a preset pixel value;
    if the actual pixel value is greater than the preset pixel value, taking the initial face image as the target face image.
  5. The face image data collection method according to claim 1, wherein the crawling an original image from the network using an image crawler tool comprises:
    crawling the web address of the original image from the network using a web crawler;
    storing the web address in a to-download message queue;
    crawling the original image from the web page corresponding to the web address in the to-download message queue using an image download tool.
  6. A face image data collection apparatus, comprising:
    an original image crawling module, configured to crawl an original image from the network using an image crawler tool;
    an effective image recognition module, configured to recognize the original image using a face recognition algorithm to obtain an effective image containing facial features;
    an effective image capture module, configured to capture a target face image from the effective image using a preset selection box.
  7. The face image data collection apparatus according to claim 6, wherein the effective image recognition module comprises:
    a facial feature determining unit, configured to recognize the original image using a face recognition algorithm and determine whether five facial features are present in the original image;
    a facial feature completeness determining unit, configured to obtain the facial features of the original image and determine whether the completeness of the facial features present in the original image reaches a preset completeness;
    a first image acquiring unit, configured to take the original image as the effective image containing facial features when the facial feature completeness reaches the preset completeness.
  8. The face image data collection apparatus according to claim 6 or 7, wherein the effective image recognition module comprises:
    a face region recognition unit, configured to recognize the original image using a face recognition algorithm and determine whether a face region is present in the original image;
    an image ratio determining unit, configured to calculate a face image ratio when a face region is present in the original image, and determine whether the face image ratio is greater than a preset ratio;
    a second image acquiring unit, configured to take the original image as the effective image containing facial features when the face image ratio is greater than the preset ratio.
  9. The face image data collection apparatus according to claim 6, wherein the original image crawling module comprises:
    a web address crawling unit, configured to crawl the web address of the original image from the network using a web crawler;
    a web address storage unit, configured to store the web address in a to-download message queue;
    an image download unit, configured to crawl the original image from the web page corresponding to the web address in the to-download message queue using an image download tool.
  10. The face image data collection apparatus according to claim 6, wherein the effective image capture module comprises:
    an initial image capture unit, configured to capture an initial face image containing the facial features from the effective image using a preset selection box;
    an image pixel acquiring unit, configured to obtain an actual pixel value of the initial face image;
    an image pixel determining unit, configured to determine whether the actual pixel value is greater than a preset pixel value;
    a target image acquiring unit, configured to take the initial face image as the target face image when the actual pixel value is greater than the preset pixel value.
  11. A terminal device, comprising a memory, a processor, and computer readable instructions stored in the memory and operable on the processor, wherein the processor, when executing the computer readable instructions, implements the following steps:
    crawling an original image from the network using an image crawler tool;
    recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features;
    capturing a target face image from the effective image using a preset selection box.
  12. The terminal device according to claim 11, wherein the recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features comprises:
    recognizing the original image using a face recognition algorithm, and determining whether five facial features are present in the original image;
    if the five facial features are present in the original image, obtaining the facial feature completeness of the original image, and determining whether the facial feature completeness reaches a preset completeness;
    if the facial feature completeness reaches the preset completeness, taking the original image as the effective image containing facial features.
  13. The terminal device according to claim 11 or 12, wherein the recognizing the original image using a face recognition algorithm to obtain an effective image containing facial features comprises:
    recognizing the original image using a face recognition algorithm, and determining whether a face region is present in the original image;
    if a face region is present in the original image, calculating a face image ratio, and determining whether the face image ratio is greater than a preset ratio;
    if the face image ratio is greater than the preset ratio, taking the original image as the effective image containing facial features.
  14. The terminal device according to claim 11, wherein the capturing a target face image from the effective image using a preset selection box comprises:
    capturing an initial face image containing the facial features from the effective image using a preset selection box;
    obtaining an actual pixel value of the initial face image;
    determining whether the actual pixel value is greater than a preset pixel value;
    if the actual pixel value is greater than the preset pixel value, taking the initial face image as the target face image.
  15. The terminal device according to claim 11, wherein the crawling an original image from the network using an image crawler tool comprises:
    crawling the web address of the original image from the network using a web crawler;
    storing the web address in a to-download message queue;
    crawling the original image from the web page corresponding to the web address in the to-download message queue using an image download tool.
  16. 一种计算机可读存储介质,所述计算机可读存储介质存储有计算机可读指令,其特征在于,所述计算机可读指令被处理器执行时实现如下步骤:
    采用图片爬虫工具从网络中爬取原始图像;
    采用人脸识别算法对所述原始图像进行识别,获取包含人脸特征的有效图像;
    采用预设选取框从所述有效图像中截取目标人脸图像。
  17. The computer-readable storage medium according to claim 16, wherein the recognizing the original image using a face recognition algorithm to obtain a valid image containing facial features comprises:
    recognizing the original image using a face recognition algorithm and determining whether facial features are present in the original image;
    if facial features are present in the original image, obtaining the facial feature completeness of the original image and determining whether the facial feature completeness reaches a preset completeness;
    if the facial feature completeness reaches the preset completeness, taking the original image as the valid image containing facial features.
  18. The computer-readable storage medium according to claim 16 or 17, wherein the recognizing the original image using a face recognition algorithm to obtain a valid image containing facial features comprises:
    recognizing the original image using a face recognition algorithm and determining whether a face region is present in the original image;
    if a face region is present in the original image, calculating a face image ratio and determining whether the face image ratio is greater than a preset ratio;
    if the face image ratio is greater than the preset ratio, taking the original image as the valid image containing facial features.
  19. The computer-readable storage medium according to claim 16, wherein the cropping a target face image from the valid image using a preset selection box comprises:
    cropping an initial face image containing the facial features from the valid image using a preset selection box;
    obtaining an actual pixel value of the initial face image;
    determining whether the actual pixel value is greater than a preset pixel value;
    if the actual pixel value is greater than the preset pixel value, taking the initial face image as the target face image.
  20. The computer-readable storage medium according to claim 16, wherein the crawling original images from the network using an image crawler tool comprises:
    crawling web addresses of the original images from the network using a web crawler;
    storing the web addresses in a to-be-downloaded message queue;
    crawling the original images, using an image download tool, from the web pages corresponding to the web addresses in the to-be-downloaded message queue.
PCT/CN2018/074575 2017-08-17 2018-01-30 Face image data collection method, apparatus, terminal device and storage medium WO2019033715A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
SG11201809210VA SG11201809210VA (en) 2017-08-17 2018-01-30 Face image data collection method, apparatus, terminal device and storage medium
US16/088,828 US20200387748A1 (en) 2017-08-17 2018-01-30 Facial image data collection method, apparatus, terminal device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710706509.1 2017-08-17
CN201710706509.1A CN107679546A (zh) 2017-08-17 2017-08-17 Face image data collection method, apparatus, terminal device and storage medium

Publications (1)

Publication Number Publication Date
WO2019033715A1 true WO2019033715A1 (zh) 2019-02-21

Family

ID=61135091

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/074575 WO2019033715A1 (zh) 2017-08-17 2018-01-30 Face image data collection method, apparatus, terminal device and storage medium

Country Status (4)

Country Link
US (1) US20200387748A1 (zh)
CN (1) CN107679546A (zh)
SG (1) SG11201809210VA (zh)
WO (1) WO2019033715A1 (zh)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109063542A (zh) * 2018-06-11 2018-12-21 平安科技(深圳)有限公司 Picture recognition method and apparatus, computer device, and storage medium
CN108875654B (zh) * 2018-06-25 2021-03-05 深圳云天励飞技术有限公司 Face feature collection method and apparatus
CN109063784B (zh) * 2018-08-23 2021-03-05 深圳码隆科技有限公司 Person clothing image data screening method and apparatus
CN109255319A (zh) * 2018-09-02 2019-01-22 珠海横琴现联盛科技发展有限公司 Anti-counterfeiting method for face recognition payment information in static photos
CN109597833A (zh) * 2018-10-15 2019-04-09 平安科技(深圳)有限公司 Big-data-based event prediction method and apparatus, computer device, and storage medium
CN109727350A (zh) * 2018-12-14 2019-05-07 深圳壹账通智能科技有限公司 Face-recognition-based access control method and apparatus
US10990807B2 (en) * 2019-09-06 2021-04-27 Adobe, Inc. Selecting representative recent digital portraits as cover images
CN110825808A (zh) * 2019-09-23 2020-02-21 重庆特斯联智慧科技股份有限公司 Edge-computing-based distributed face database system and generation method therefor
US11552914B2 (en) * 2019-10-06 2023-01-10 International Business Machines Corporation Filtering group messages
CN110909609A (zh) * 2019-10-26 2020-03-24 湖北讯獒信息工程有限公司 Artificial-intelligence-based expression recognition method
CN111563416A (zh) * 2020-04-08 2020-08-21 安徽舒州农业科技有限责任公司 Automatic steering method and system based on a rice transplanter
CN111680202B (zh) * 2020-04-24 2022-04-26 烽火通信科技股份有限公司 Ontology-based face image data collection method and apparatus
CN112085701B (zh) * 2020-08-05 2024-06-11 深圳市优必选科技股份有限公司 Face blur detection method and apparatus, terminal device, and storage medium
CN112037373A (zh) * 2020-08-10 2020-12-04 国网上海市电力公司 Face-recognition-based five-prevention safety auxiliary device
CN112202865A (zh) * 2020-09-25 2021-01-08 北京微步在线科技有限公司 Picture download method and apparatus
CN112132074A (zh) * 2020-09-28 2020-12-25 平安养老保险股份有限公司 Face image verification method and apparatus, computer device, and storage medium
US20220358333A1 (en) * 2021-05-07 2022-11-10 Ford Global Technologies, Llc Automatic annotation using ground truth data for machine learning models

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778481A (zh) * 2014-12-19 2015-07-15 五邑大学 Method and apparatus for constructing a large-scale face pattern analysis sample library
CN106815557A (zh) * 2016-12-20 2017-06-09 北京奇虎科技有限公司 Facial feature evaluation method and apparatus, and mobile terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103618944A (zh) * 2013-11-27 2014-03-05 乐视网信息技术(北京)股份有限公司 Video control method and user terminal

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778481A (zh) * 2014-12-19 2015-07-15 五邑大学 Method and apparatus for constructing a large-scale face pattern analysis sample library
CN106815557A (zh) * 2016-12-20 2017-06-09 北京奇虎科技有限公司 Facial feature evaluation method and apparatus, and mobile terminal

Also Published As

Publication number Publication date
SG11201809210VA (en) 2019-03-28
US20200387748A1 (en) 2020-12-10
CN107679546A (zh) 2018-02-09

Similar Documents

Publication Publication Date Title
WO2019033715A1 (zh) Face image data collection method, apparatus, terminal device and storage medium
US11151363B2 (en) Expression recognition method, apparatus, electronic device, and storage medium
US20230087526A1 (en) Neural network training method, image classification system, and related device
WO2021218060A1 (zh) Deep learning-based face recognition method and apparatus
WO2022033150A1 (zh) Image recognition method and apparatus, electronic device, and storage medium
WO2018188453A1 (zh) Face region determination method, storage medium, and computer device
CA2934514C (en) System and method for identifying faces in unconstrained media
WO2019119505A1 (zh) Face recognition method and apparatus, computer device, and storage medium
CN110569756B (zh) Face recognition model construction method, recognition method, device, and storage medium
WO2021139324A1 (zh) Image recognition method and apparatus, computer-readable storage medium, and electronic device
WO2019033525A1 (zh) AU feature recognition method and apparatus, and storage medium
WO2021051611A1 (zh) Face visibility-based face recognition method, system, apparatus, and storage medium
Choi et al. Incremental face recognition for large-scale social network services
WO2018176954A1 (zh) Method, device and system for providing friend-making candidates
WO2022105118A1 (zh) Image-based health status recognition method and apparatus, device, and storage medium
Dong et al. Comparison of random forest, random ferns and support vector machine for eye state classification
WO2021104125A1 (zh) Poultry egg abnormality recognition method, apparatus and system, storage medium, and electronic device
WO2021051547A1 (zh) Violent behavior detection method and system
CN109033935B (zh) Forehead wrinkle detection method and apparatus
WO2021238586A1 (zh) Training method, apparatus and device, and computer-readable storage medium
WO2018176953A1 (zh) Method, device and system for providing friend-making candidates
WO2020037962A1 (zh) Facial image correction method and apparatus, and storage medium
Radman et al. Robust face pseudo-sketch synthesis and recognition using morphological-arithmetic operations and HOG-PCA
CN113298158A (zh) Data detection method, apparatus, device, and storage medium
Luo et al. The iBUG eye segmentation dataset

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18845776

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23/09/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18845776

Country of ref document: EP

Kind code of ref document: A1