CN106844412A - Face data collection method and device - Google Patents
- Publication number
- CN106844412A CN106844412A CN201610949218.0A CN201610949218A CN106844412A CN 106844412 A CN106844412 A CN 106844412A CN 201610949218 A CN201610949218 A CN 201610949218A CN 106844412 A CN106844412 A CN 106844412A
- Authority
- CN
- China
- Prior art keywords
- data
- face data
- human face
- cleaning
- target person
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
- G06F16/162—Delete operations
- G06F16/51—Indexing; Data structures therefor; Storage structures
- G06F16/583—Retrieval characterised by using metadata automatically derived from the content
- G06F18/22—Matching criteria, e.g. proximity measures
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
Abstract
An embodiment of the invention discloses a face data collection method. The method includes: automatically acquiring at least one piece of reference person data associated with a target person identifier; storing the at least one piece of reference person data in association with the corresponding target person identifier; and automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier. An embodiment of the invention also discloses a corresponding face data collection device. The technical solution provided by the embodiments of the present invention can reduce the cost of face data collection and save the time and labor spent on face data collection.
Description
Technical field
The present invention relates to the field of image recognition, and in particular to a face data collection method and device.
Background art
Face recognition has gradually come to prominence in many fields. The success of face recognition technology depends on two major factors: first, important breakthroughs in deep learning in the field of artificial intelligence, which have brought great success to applications in image recognition, natural language processing, and many other areas; second, the availability of large-scale training data, which allows deep learning techniques to build neural networks that better simulate the human brain.
However, face recognition applications have long lacked large-scale public data sets. Traditional data collection methods generally fall into the following two categories: 1. collection by purchase; 2. manual collection, in which the collected data is labeled and cleaned, pictures of the same person are put into the same folder, and an ID number is assigned, so as to ensure that each folder contains only pictures of the same person. Either data collection method has the following disadvantages: high cost, wasted time, and wasted labor.
Summary of the invention
Embodiments of the present invention provide a face data collection method and device, so as to reduce the cost of face data collection and save the time and labor spent on face data collection.
A first aspect of an embodiment of the present invention provides a face data collection method, including:
automatically acquiring at least one piece of reference person data associated with a target person identifier;
storing the at least one piece of reference person data in association with the corresponding target person identifier;
automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier.
With reference to the first aspect, in some possible implementations, the automatically acquiring at least one piece of reference person data associated with the target person identifier includes:
determining the target person identifier;
automatically acquiring at least one piece of reference person data associated with the target person identifier.
With reference to the first aspect, in some possible implementations, the automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier includes:
automatically cleaning the at least one piece of reference person data so that the at least one piece of reference person data after cleaning includes only reference face data, wherein the reference person data includes reference face data and reference non-face data;
detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, deleting the unmatched reference face data;
if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retaining the reference face data, so as to obtain target face data associated with the target person identifier.
With reference to the first aspect, in some possible implementations, the automatically cleaning the at least one piece of reference person data so that the at least one piece of reference person data after cleaning includes only reference face data includes:
detecting, based on face detection technology, whether a face frame exists in the at least one piece of reference person data;
if a face frame exists in the at least one piece of reference person data, retaining the reference person data;
if no face frame exists in the at least one piece of reference person data, deleting the reference person data.
With reference to the first aspect, in some possible implementations, the detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier includes:
determining the benchmark face data corresponding to the target person identifier;
comparing the cleaned reference face data with the benchmark face data;
if the comparison result between the cleaned reference face data and the benchmark face data is a similarity greater than or equal to a first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
if the comparison result between the cleaned reference face data and the benchmark face data is a similarity less than or equal to a second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
A second aspect of an embodiment of the present invention provides a face data collection device, the device including:
an acquiring unit, configured to automatically acquire at least one piece of reference person data associated with a target person identifier;
a storage unit, configured to store the at least one piece of reference person data in association with the corresponding target person identifier;
a cleaning unit, configured to automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
With reference to the second aspect, in some possible implementations, the acquiring unit specifically includes:
a determining subunit, configured to determine the target person identifier;
an acquiring subunit, configured to automatically acquire at least one piece of reference person data associated with the target person identifier.
With reference to the second aspect, in some possible implementations, the cleaning unit specifically includes:
a first cleaning subunit, configured to automatically clean the at least one piece of reference person data so that the at least one piece of reference person data after cleaning includes only reference face data, wherein the reference person data includes reference face data and reference non-face data;
a second cleaning subunit, configured to detect, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, delete the unmatched reference face data; and if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retain the reference face data, so as to obtain target face data associated with the target person identifier.
With reference to the second aspect, in some possible implementations, the first cleaning subunit is specifically configured to detect, based on face detection technology, whether a face frame exists in the at least one piece of reference person data; if a face frame exists in the at least one piece of reference person data, retain the reference person data; and if no face frame exists in the at least one piece of reference person data, delete the reference person data.
With reference to the second aspect, in some possible implementations, the second cleaning subunit, when detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, is specifically configured to: determine the benchmark face data corresponding to the target person identifier; compare the cleaned reference face data with the benchmark face data; if the comparison result between the cleaned reference face data and the benchmark face data is a similarity greater than or equal to a first preset threshold, determine that the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; and if the comparison result is a similarity less than or equal to a second preset threshold, determine that the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
In a third aspect, an embodiment of the present invention provides a face data collection device, the face data collection device including a processor, a memory, a receiver, a transmitter, and a communication bus, wherein the processor, the memory, the receiver, and the transmitter are connected through the communication bus and communicate with one another;
the processor is configured to call executable program code stored in the memory to perform some or all of the steps described in any method of the first aspect of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, wherein the computer-readable storage medium stores program code to be executed by a terminal device, the program code specifically includes execution instructions, and the execution instructions are used to perform some or all of the steps described in any method of the first aspect of the embodiments of the present invention.
It can be seen that, in the technical solutions of the embodiments of the present invention, the face data collection device automatically acquires at least one piece of reference person data associated with a target person identifier, stores the at least one piece of reference person data in association with the corresponding target person identifier, and then automatically cleans the at least one piece of reference person data to obtain target face data associated with the target person identifier. By implementing the embodiments of the present invention, target face data meeting certain requirements can be collected by automated means, which reduces the cost of face data collection, saves the time and labor spent on face data collection, and is thereby conducive to establishing a face database of a certain scale and cleanliness.
Brief description of the drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention or in the prior art, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely some embodiments of the present invention; those of ordinary skill in the art can also obtain other drawings from these drawings without creative work.
Fig. 1 is a schematic flowchart of a face data collection method provided by a first embodiment of the present invention;
Fig. 1-1 is a schematic diagram of the effect of preliminary cleaning in a face data collection method provided by the first embodiment of the present invention;
Fig. 1-2 is a schematic diagram of the effect of further cleaning in a face data collection method provided by the first embodiment of the present invention;
Fig. 1-3 is a schematic diagram of the effect of face data collection in a face data collection method provided by the first embodiment of the present invention;
Fig. 2 is a schematic flowchart of a face data collection method provided by a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a face data collection device provided by a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a face data collection device provided by a fourth embodiment of the present invention.
Detailed description of the embodiments
In order to enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are clearly and completely described below in conjunction with the accompanying drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", etc. in the description, claims, and drawings of this specification are used to distinguish different objects, not to describe a particular order. In addition, "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device containing a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to such processes, methods, products, or devices.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present invention. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a face data collection method provided by the first embodiment of the present invention. As shown in Fig. 1, the face data collection method in this embodiment of the present invention includes the following steps:
S101: automatically acquire at least one piece of reference person data associated with a target person identifier.
First, the target person identifier is determined. Specifically, the target person identifier may be the name of the target person. For example, a roster of Asian names can be obtained from Baidu picture categories such as beautiful female stars and handsome male stars, or a list of world-renowned persons (mostly American and European actors) can be obtained from the Internet Movie Database (IMDb) celebrity roll. When determining the target person identifier, attention can be focused on image collections of celebrities and public figures. The advantage of this is that it is both convenient to obtain images of a specified celebrity on the network, and privacy and infringement problems brought about by using these pictures can be avoided.
Secondly, at least one piece of reference person data associated with the target person identifier is automatically acquired. Specifically, at least one piece of reference person data corresponding to each person identifier is acquired. If the target person identifier is the name of the target person, the reference person data can be pictures related to the target person. For example, pictures are collected for each celebrity, and the picture libraries of major websites can be crawled directly by means of a web crawler, such as picture websites like Google Image Search, Bing, and Baidu.
S102: store the at least one piece of reference person data in association with the corresponding target person identifier.
Specifically, pictures are collected and stored for each celebrity separately, so as to obtain a collection of folders labeled with each celebrity's name, where each folder contains the pictures relevant to that celebrity's name.
S103: automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
First, the at least one piece of reference person data is cleaned automatically so that the at least one piece of reference person data after cleaning includes only reference face data, wherein the reference person data includes reference face data and reference non-face data. Specifically, the purity of each folder is preliminarily improved. Among the pictures collected in the previous stage, the purity of the image data in each folder is relatively low, that is, some pictures are non-face pictures. In this case, preliminary cleaning of each folder is needed so that each folder contains only face data. The method for solving this problem is to run detection over the collected folders using OpenCV or another toolkit containing face detection (such as the five-keypoint detection program package provided by Professor Tang Xiaoou of the Chinese University of Hong Kong), so as to complete the preliminary cleaning of the data. The concrete principle is: the position of the face is located using face key points, a face frame is returned according to the positions of the key points, and whether a picture contains a face is judged according to whether a face frame is returned for it. If a picture contains no face, the picture is deleted; otherwise it is retained, thus completing the preliminary cleaning of the data. The effect of the preliminary cleaning is shown in Fig. 1-1.
Secondly, based on face recognition technology, it is detected whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier. If the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, the unmatched reference face data is deleted; if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, the reference face data is retained, so as to obtain target face data associated with the target person identifier.
Specifically, the face data in each folder is further purified. After the cleaning yields reference person data that includes only reference face data, it is still not guaranteed that the pictures in the reference person data all show the same person, so this needs to be processed. Target face data showing the same person as the target person identifier is retained, and target face data different from the target person identifier is deleted, thus completing the final data purification. The specific practice is as follows: using existing face recognition technology (for example, the face recognition APIs provided by Face++, LinkFace, or Microsoft can be called), face verification is performed on the faces in each folder, so as to identify the facial images that match the target person identifier corresponding to the folder and the facial images that differ from the target person identifier corresponding to the folder, and to mark them respectively. In subsequent processing, the facial images with the same mark are extracted, and the facial images with a different mark are deleted. Taking a current folder whose corresponding target person identifier is "yaochen" as an example, the face recognition API of Face++ is called to perform 1:N face verification on the pictures in the folder. The detailed process is as follows. First, the benchmark face data Img1 corresponding to the target person identifier is determined, and the benchmark face data is uploaded to obtain a face identifier id1. Then all the pictures in the folder are compared: all pictures in the folder are uploaded in turn, wherein, when the benchmark face data is itself a picture in the folder, the other pictures except the benchmark face data Img1 are uploaded in turn. Each test picture obtains an identifier id2, and after the face recognition API is called, the comparison result of id1 and id2 is returned. The result returned by the program is true or false: true represents the same person, and false represents a different person. The names of pictures whose returned result is "false" are modified to start with "XXXX_", and the names of pictures that fail to upload are modified to start with "YYYY_". Next, the image files starting with "XXXX_" are deleted, and the image files starting with "YYYY_" are selected as needed. If no requirement is set on the number of images, the image files starting with "YYYY_" can be deleted directly; if it is desired that the number of pictures of the same person be as large as possible, the image files starting with "YYYY_" are retained for selection. The effect of the further cleaning is shown in Fig. 1-2. Finally, the effect of collecting the face data associated with the target person identifier is shown in Fig. 1-3.
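The mark-then-delete pass above can be sketched as follows. The `compare` callable is a hypothetical stand-in for a hosted face-verification API of the kind the text mentions (Face++, LinkFace, Microsoft); no real service's request format is reproduced here. It is assumed to return True (same person), False (different person), or None (upload failure).

```python
import os

def verify_and_purify(folder, benchmark, compare, keep_failed=True):
    """1:N verification over `folder` (further cleaning of step S103).

    Pictures judged a different person are renamed with the "XXXX_"
    prefix and then deleted; pictures that fail to upload are renamed
    with the "YYYY_" prefix and kept or deleted depending on whether
    more pictures of the person are wanted.
    """
    for name in sorted(os.listdir(folder)):   # snapshot before renaming
        path = os.path.join(folder, name)
        if path == benchmark:
            continue                          # skip the benchmark picture itself
        result = compare(benchmark, path)
        if result is True:
            continue                          # same person: retain as-is
        prefix = "XXXX_" if result is False else "YYYY_"
        new_path = os.path.join(folder, prefix + name)
        os.rename(path, new_path)
        if prefix == "XXXX_" or not keep_failed:
            os.remove(new_path)
```

After the pass, the folder holds only retained pictures and, optionally, "YYYY_"-prefixed upload failures awaiting manual selection.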
Wherein, a specific implementation of the automatically acquiring at least one piece of reference person data associated with the target person identifier may be:
determining the target person identifier;
automatically acquiring at least one piece of reference person data associated with the target person identifier.
Wherein, a specific implementation of the automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier may be:
automatically cleaning the at least one piece of reference person data so that the at least one piece of reference person data after cleaning includes only reference face data, wherein the reference person data includes reference face data and reference non-face data;
detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retaining the reference face data;
if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, deleting the unmatched reference face data, so as to obtain target face data associated with the target person identifier.
Wherein, a specific implementation of the automatically cleaning the at least one piece of reference person data so that the at least one piece of reference person data after cleaning includes only reference face data may be:
detecting, based on face detection technology, whether a face frame exists in the at least one piece of reference person data;
if a face frame exists in the at least one piece of reference person data, retaining the reference person data;
if no face frame exists in the at least one piece of reference person data, deleting the reference person data.
Wherein, a specific implementation of the detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier may be:
determining the benchmark face data corresponding to the target person identifier;
comparing the cleaned reference face data with the benchmark face data;
if the comparison result between the cleaned reference face data and the benchmark face data is a similarity greater than or equal to a first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
if the comparison result between the cleaned reference face data and the benchmark face data is a similarity less than or equal to a second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
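The two-threshold rule above can be written as a small decision function. The threshold values 0.75 and 0.60 are illustrative assumptions (the text names no values), and since the text leaves scores strictly between the second and first thresholds unspecified, this sketch reports them as undecided.

```python
def match_decision(similarity, first_threshold=0.75, second_threshold=0.60):
    """Similarity >= first preset threshold -> match;
    similarity <= second preset threshold -> mismatch.
    Threshold values are illustrative assumptions; the in-between case
    is not specified by the text and is reported as "undecided"."""
    if similarity >= first_threshold:
        return "match"
    if similarity <= second_threshold:
        return "mismatch"
    return "undecided"
```

Using two thresholds rather than one leaves a buffer band in which neither retention nor deletion is forced, which is one way to read why the text defines the first and second thresholds separately.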
It can be seen that, in the technical solutions of the embodiments of the present invention, the face data collection device automatically acquires at least one piece of reference person data associated with a target person identifier, stores the at least one piece of reference person data in association with the corresponding target person identifier, and then automatically cleans the at least one piece of reference person data to obtain target face data associated with the target person identifier. By implementing the embodiments of the present invention, target face data meeting certain requirements can be collected by automated means, which reduces the cost of face data collection, saves the time and labor spent on face data collection, and is thereby conducive to establishing a face database of a certain scale and cleanliness.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a face data collection method provided by the second embodiment of the present invention. As shown in Fig. 2, the face data collection method in this embodiment of the present invention includes the following steps:
S201: determine a target person identifier.
S202: automatically acquire at least one piece of reference person data associated with the target person identifier.
S203: store the at least one piece of reference person data in association with the corresponding target person identifier.
S204: detect, based on face detection technology, whether a face frame exists in the at least one piece of reference person data.
Wherein, if it is detected based on face detection technology that no face frame exists in the at least one piece of reference person data, step S205 is performed; otherwise, step S206 is performed.
S205: if no face frame exists in the at least one piece of reference person data, delete the reference person data.
S206: if a face frame exists in the at least one piece of reference person data, retain the reference person data.
Wherein, the reference person data includes reference face data and reference non-face data.
S207: determine the benchmark face data corresponding to the target person identifier.
S208: compare the cleaned reference face data with the benchmark face data.
Wherein, if the comparison result between the cleaned reference face data and the benchmark face data is a similarity less than or equal to a second preset threshold, step S209 is performed; otherwise, step S210 is performed.
S209: if the comparison result between the cleaned reference face data and the benchmark face data is a similarity less than or equal to the second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, and the unmatched reference face data is deleted.
S210: if the comparison result between the cleaned reference face data and the benchmark face data is a similarity greater than or equal to the first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, and the reference face data is retained.
S211: obtain target face data associated with the target person identifier.
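The S201 to S211 flow can be tied together in one driver sketch. All callables and the threshold values here are assumptions standing in for a crawler, a face detector, and a face-recognition comparator, as in the sketches of the first embodiment.

```python
def collect_target_face_data(target_id, acquire, has_face, similarity,
                             benchmark, first_threshold=0.75,
                             second_threshold=0.60):
    """End-to-end sketch of steps S201-S211: acquire and store reference
    pictures for a target person identifier, drop those without a face
    frame (S204-S206), then retain only those whose similarity to the
    benchmark face reaches the first threshold (S207-S210)."""
    store = {target_id: list(acquire(target_id))}        # S201-S203
    kept = [p for p in store[target_id] if has_face(p)]  # S204-S206
    target_face_data = []
    for p in kept:                                       # S207-S210
        if similarity(benchmark, p) >= first_threshold:
            target_face_data.append(p)                   # match: retain
        # scores at or below second_threshold are mismatches: discarded
    return target_face_data                              # S211
```

The driver returns the target face data of S211; plugging in the earlier crawling, detection, and verification sketches yields the full automated pipeline.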
Wherein, S101 is extremely during step S201 refer to first embodiment of the invention to the specific implementation of step S211
The associated description of S103.
It can be seen that, in the technical solutions of the embodiments of the present invention, the face data collection apparatus automatically acquires at least one piece of reference person data associated with a target person identifier, stores the reference person data in association with the corresponding target person identifier, and then automatically cleans the reference person data to obtain target face data associated with the target person identifier. By implementing the embodiments of the present invention, target face data meeting certain requirements can be collected by an automated method, which reduces the cost of face data collection, saves the time and labor spent on collection, and thereby facilitates building a face database that is both large and clean.
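The acquire, store, and clean stages summarized above can be sketched as a single pipeline. This is a hypothetical illustration, not the patent's implementation: `acquire_fn` and `clean_fn` stand in for any acquisition source (for example, an image search) and any cleaning procedure (for example, the face-detection and face-recognition passes described in this document).

```python
class FaceDataCollector:
    """Sketch of the acquire -> store -> clean pipeline.
    The acquisition and cleaning callables are hypothetical placeholders."""

    def __init__(self, acquire_fn, clean_fn):
        self.acquire_fn = acquire_fn
        self.clean_fn = clean_fn
        self.store = {}  # target person identifier -> reference person data

    def collect(self, person_id):
        data = self.acquire_fn(person_id)  # automatic acquisition
        self.store[person_id] = data       # associated storage
        target = self.clean_fn(data)       # automatic cleaning
        return target                      # target face data

collector = FaceDataCollector(
    acquire_fn=lambda pid: ["face_a", "junk", "face_b"],
    clean_fn=lambda items: [i for i in items if i.startswith("face")],
)
print(collector.collect("person-001"))  # ['face_a', 'face_b']
```

The associated storage step keeps the raw reference data keyed by the person identifier, so cleaning can later be re-run or audited without re-acquisition.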
The following are apparatus embodiments of the present invention, which are used to perform the methods implemented in the first and second method embodiments. For ease of description, only the parts related to the embodiments of the present invention are shown; for specific technical details that are not disclosed, refer to the first and second embodiments of the present invention.
Referring to Fig. 3, Fig. 3 is a schematic structural diagram of a face data collection apparatus provided by a third embodiment of the present invention. As shown in Fig. 3, the face data collection apparatus in this embodiment of the present invention includes the following units:
an acquiring unit 301, configured to automatically acquire at least one piece of reference person data associated with a target person identifier;
a storage unit 302, configured to store the at least one piece of reference person data in association with the corresponding target person identifier; and
a cleaning unit 303, configured to automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
Optionally, the acquiring unit 301 specifically includes:
a determining subunit 3011, configured to determine the target person identifier; and
an acquiring subunit 3012, configured to automatically acquire the at least one piece of reference person data associated with the target person identifier.
Optionally, the cleaning unit 303 specifically includes:
a first cleaning subunit 3031, configured to automatically clean the at least one piece of reference person data so that, after cleaning, the at least one piece of reference person data includes only reference face data, wherein the reference person data includes reference face data and reference non-face data; and
a second cleaning subunit 3032, configured to detect, based on a face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, delete the unmatched reference face data; and if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retain the reference face data, so as to obtain the target face data associated with the target person identifier.
Optionally, the first cleaning subunit 3031 is specifically configured to detect, based on a face detection technology, whether a face frame exists in the at least one piece of reference person data; if a face frame exists in a piece of reference person data, retain that reference person data; and if no face frame exists in a piece of reference person data, delete that reference person data.
Optionally, the second cleaning subunit 3032, when detecting, based on the face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, is specifically configured to: determine the benchmark face data corresponding to the target person identifier; compare the cleaned reference face data with the benchmark face data; if the result of the comparison is a similarity greater than or equal to a first preset threshold, determine that the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; and if the result of the comparison is a similarity less than or equal to a second preset threshold, determine that the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
Specifically, for the implementation of the above units, refer to the descriptions of the related steps in the embodiments corresponding to Fig. 1 and Fig. 2; details are not repeated here.
It can be seen that, in the technical solutions of the embodiments of the present invention, the face data collection apparatus automatically acquires at least one piece of reference person data associated with a target person identifier, stores the reference person data in association with the corresponding target person identifier, and then automatically cleans the reference person data to obtain target face data associated with the target person identifier. By implementing the embodiments of the present invention, target face data meeting certain requirements can be collected by an automated method, which reduces the cost of face data collection, saves the time and labor spent on collection, and thereby facilitates building a face database that is both large and clean.
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a face data collection apparatus provided by a fourth embodiment of the present invention. As shown in Fig. 4, the face data collection apparatus in this embodiment of the present invention includes: at least one processor 401 (such as a CPU), at least one receiver 403, at least one memory 404, at least one transmitter 405, and at least one communication bus 402, wherein the communication bus 402 is used to implement connection and communication between these components. The receiver 403 and the transmitter 405 of the apparatus in this embodiment may be wired transmission ports, or may be wireless devices (for example, including antenna apparatuses) for communicating signaling or data with other node devices. The memory 404 may be a high-speed RAM memory, or may be a non-volatile memory, for example, at least one disk memory. Optionally, the memory 404 may also be at least one storage apparatus located remotely from the foregoing processor 401. A set of program codes is stored in the memory 404, and the processor 401 may call, through the communication bus 402, the codes stored in the memory 404 to perform the related functions.
The processor 401 is configured to: automatically acquire at least one piece of reference person data associated with a target person identifier; store the at least one piece of reference person data in association with the corresponding target person identifier; and automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
Optionally, when automatically acquiring the at least one piece of reference person data associated with the target person identifier, the processor 401 is specifically configured to: determine the target person identifier; and automatically acquire the at least one piece of reference person data associated with the target person identifier.
Optionally, when automatically cleaning the at least one piece of reference person data to obtain the target face data associated with the target person identifier, the processor 401 is specifically configured to: automatically clean the at least one piece of reference person data so that, after cleaning, the at least one piece of reference person data includes only reference face data, wherein the reference person data includes reference face data and reference non-face data; detect, based on a face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, delete the unmatched reference face data; and if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retain the reference face data to obtain the target face data associated with the target person identifier.
Optionally, when automatically cleaning the at least one piece of reference person data so that, after cleaning, it includes only reference face data, the processor 401 is specifically configured to: detect, based on a face detection technology, whether a face frame exists in the at least one piece of reference person data; if a face frame exists in a piece of reference person data, retain that reference person data; and if no face frame exists in a piece of reference person data, delete that reference person data.
Optionally, when detecting, based on the face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, the processor 401 is specifically configured to: determine the benchmark face data corresponding to the target person identifier; compare the cleaned reference face data with the benchmark face data; if the result of the comparison is a similarity greater than or equal to a first preset threshold, determine that the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; and if the result of the comparison is a similarity less than or equal to a second preset threshold, determine that the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
Specifically, for the foregoing implementations, refer to the descriptions of the related steps in the embodiments corresponding to Fig. 1 and Fig. 2; details are not repeated here.
It can be seen that, in the technical solutions of the embodiments of the present invention, the face data collection apparatus automatically acquires at least one piece of reference person data associated with a target person identifier, stores the reference person data in association with the corresponding target person identifier, and then automatically cleans the reference person data to obtain target face data associated with the target person identifier. By implementing the embodiments of the present invention, target face data meeting certain requirements can be collected by an automated method, which reduces the cost of face data collection, saves the time and labor spent on collection, and thereby facilitates building a face database that is both large and clean.
An embodiment of the present invention further provides a computer storage medium. The computer storage medium may store a program, and when the program is executed, some or all of the steps of any method described in the foregoing method embodiments are performed.
It should be noted that, for ease of description, each of the foregoing method embodiments is expressed as a series of action combinations. However, those skilled in the art should know that the present invention is not limited by the described action sequence, because according to the present invention, some steps may be performed in other sequences or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in this specification are all preferred embodiments, and the actions and units involved are not necessarily required by the present invention.
The sequence of the steps of the methods in the embodiments of the present invention may be adjusted, and certain steps may be merged or deleted according to actual needs. The units of the terminal in the embodiments of the present invention may be integrated, further divided, or deleted according to actual needs.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division of the units is merely a logical function division, and there may be other division manners in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place, or may be distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program codes, such as a USB flash drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
A person of ordinary skill in the art may understand that all or some of the steps in the various methods of the foregoing embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.
The face data collection method and apparatus provided by the embodiments of the present invention are described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the descriptions of the above embodiments are only intended to help understand the method and core idea of the present invention. Meanwhile, for a person of ordinary skill in the art, there will be changes in the specific implementations and application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as a limitation on the present invention.
Claims (10)
1. A face data collection method, characterized in that the method comprises:
automatically acquiring at least one piece of reference person data associated with a target person identifier;
storing the at least one piece of reference person data in association with the corresponding target person identifier; and
automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier.
2. The method according to claim 1, characterized in that the automatically acquiring at least one piece of reference person data associated with a target person identifier comprises:
determining the target person identifier; and
automatically acquiring the at least one piece of reference person data associated with the target person identifier.
3. The method according to claim 1, characterized in that the automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier comprises:
automatically cleaning the at least one piece of reference person data so that, after cleaning, the at least one piece of reference person data includes only reference face data, wherein the reference person data includes reference face data and reference non-face data;
detecting, based on a face recognition technology, whether the cleaned reference face data matches benchmark face data corresponding to the target person identifier;
if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, deleting the unmatched reference face data; and
if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retaining the reference face data to obtain the target face data associated with the target person identifier.
4. The method according to claim 3, characterized in that the automatically cleaning the at least one piece of reference person data so that, after cleaning, the at least one piece of reference person data includes only reference face data comprises:
detecting, based on a face detection technology, whether a face frame exists in the at least one piece of reference person data;
if a face frame exists in a piece of reference person data, retaining that reference person data; and
if no face frame exists in a piece of reference person data, deleting that reference person data.
5. The method according to claim 3, characterized in that the detecting, based on a face recognition technology, whether the cleaned reference face data matches benchmark face data corresponding to the target person identifier comprises:
determining the benchmark face data corresponding to the target person identifier;
comparing the cleaned reference face data with the benchmark face data;
if the result of the comparison is a similarity greater than or equal to a first preset threshold, determining that the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; and
if the result of the comparison is a similarity less than or equal to a second preset threshold, determining that the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
6. A face data collection apparatus, characterized in that the apparatus comprises:
an acquiring unit, configured to automatically acquire at least one piece of reference person data associated with a target person identifier;
a storage unit, configured to store the at least one piece of reference person data in association with the corresponding target person identifier; and
a cleaning unit, configured to automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
7. The apparatus according to claim 6, characterized in that the acquiring unit specifically comprises:
a determining subunit, configured to determine the target person identifier; and
an acquiring subunit, configured to automatically acquire the at least one piece of reference person data associated with the target person identifier.
8. The apparatus according to claim 6, characterized in that the cleaning unit specifically comprises:
a first cleaning subunit, configured to automatically clean the at least one piece of reference person data so that, after cleaning, the at least one piece of reference person data includes only reference face data, wherein the reference person data includes reference face data and reference non-face data; and
a second cleaning subunit, configured to detect, based on a face recognition technology, whether the cleaned reference face data matches benchmark face data corresponding to the target person identifier; if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, delete the unmatched reference face data; and if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retain the reference face data to obtain the target face data associated with the target person identifier.
9. The apparatus according to claim 8, characterized in that:
the first cleaning subunit is specifically configured to detect, based on a face detection technology, whether a face frame exists in the at least one piece of reference person data; if a face frame exists in a piece of reference person data, retain that reference person data; and if no face frame exists in a piece of reference person data, delete that reference person data.
10. The apparatus according to claim 8, characterized in that:
the second cleaning subunit, when detecting, based on the face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, is specifically configured to: determine the benchmark face data corresponding to the target person identifier; compare the cleaned reference face data with the benchmark face data; if the result of the comparison is a similarity greater than or equal to a first preset threshold, determine that the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; and if the result of the comparison is a similarity less than or equal to a second preset threshold, determine that the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610949218.0A CN106844412A (en) | 2016-11-02 | 2016-11-02 | A kind of human face data collection method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106844412A true CN106844412A (en) | 2017-06-13 |
Family
ID=59145989
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610949218.0A Pending CN106844412A (en) | 2016-11-02 | 2016-11-02 | A kind of human face data collection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106844412A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103793697A (en) * | 2014-02-17 | 2014-05-14 | 北京旷视科技有限公司 | Identity labeling method of face images and face identity recognition method of face images |
CN105468760A (en) * | 2015-12-01 | 2016-04-06 | 北京奇虎科技有限公司 | Method and apparatus for labeling face images |
CN105608418A (en) * | 2015-12-16 | 2016-05-25 | 广东欧珀移动通信有限公司 | Picture processing method and device |
2016-11-02: CN application CN201610949218.0A filed (status: Pending)
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108932343A (en) * | 2018-07-24 | 2018-12-04 | 南京甄视智能科技有限公司 | The data set cleaning method and system of face image database |
CN108932343B (en) * | 2018-07-24 | 2020-03-27 | 南京甄视智能科技有限公司 | Data set cleaning method and system for human face image database |
CN109241310A (en) * | 2018-07-25 | 2019-01-18 | 南京甄视智能科技有限公司 | The data duplicate removal method and system of face image database |
CN109241310B (en) * | 2018-07-25 | 2020-05-01 | 南京甄视智能科技有限公司 | Data duplication removing method and system for human face image database |
CN110826390A (en) * | 2019-09-09 | 2020-02-21 | 博云视觉(北京)科技有限公司 | Video data processing method based on face vector characteristics |
CN110826390B (en) * | 2019-09-09 | 2023-09-08 | 博云视觉(北京)科技有限公司 | Video data processing method based on face vector characteristics |
CN110717091A (en) * | 2019-09-16 | 2020-01-21 | 苏宁云计算有限公司 | Entry data expansion method and device based on face recognition |
CN110807108A (en) * | 2019-10-15 | 2020-02-18 | 华南理工大学 | Asian face data automatic collection and cleaning method and system |
WO2021072998A1 (en) * | 2019-10-15 | 2021-04-22 | 华南理工大学 | Method and system for automatic collection and cleaning of asian face data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106844412A (en) | A kind of human face data collection method and device | |
CN109189991B (en) | Duplicate video identification method, device, terminal and computer readable storage medium | |
CN112199375B (en) | Cross-modal data processing method and device, storage medium and electronic device | |
WO2019200781A1 (en) | Receipt recognition method and device, and storage medium | |
CN109299258B (en) | Public opinion event detection method, device and equipment | |
CN104915351A (en) | Picture sorting method and terminal | |
CN107943792B (en) | Statement analysis method and device, terminal device and storage medium | |
CN109147769B (en) | Language identification method, language identification device, translation machine, medium and equipment | |
CN104915426B (en) | Information sorting method, the method and device for generating information sorting model | |
CN108009147B (en) | Electronic book cover generation method, electronic device and computer storage medium | |
CN103473285B (en) | Web information extraction method and device based on location markers | |
CN106528655A (en) | Text subject recognition method and device | |
CN104156464A (en) | Micro-video retrieval method and device based on micro-video feature database | |
CN106156794B (en) | Character recognition method and device based on character style recognition | |
CN106650610A (en) | Human face expression data collection method and device | |
WO2020063524A1 (en) | Method and system for determining legal instrument | |
CN110209875A (en) | User content portrait determines method, access object recommendation method and relevant apparatus | |
CN105740903B (en) | More attribute recognition approaches and device | |
CN105893601B (en) | A kind of data comparison method | |
CN108124478A (en) | Picture searching method and apparatus | |
CN107704341A (en) | File access pattern method, apparatus and electronic equipment | |
CN107193941A (en) | Story generation method and device based on picture content | |
CN109033078B (en) | The recognition methods of sentence classification and device, storage medium, processor | |
CN110895555A (en) | Data retrieval method and device, storage medium and electronic device | |
US20170060998A1 (en) | Method and apparatus for mining maximal repeated sequence |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | |
Application publication date: 20170613