CN106844412A - Face data collection method and device - Google Patents

Face data collection method and device

Info

Publication number
CN106844412A
Authority
CN
China
Prior art keywords
data
face data
human face
cleaning
target person
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610949218.0A
Other languages
Chinese (zh)
Inventor
陈书楷
朱思霖
廖剑安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Zhongkong Biological Recognition Information Technology Co Ltd
Original Assignee
Xiamen Zhongkong Biological Recognition Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Zhongkong Biological Recognition Information Technology Co Ltd
Priority to CN201610949218.0A
Publication of CN106844412A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions

Abstract

An embodiment of the invention discloses a face data collection method, comprising: automatically acquiring at least one piece of reference person data associated with a target person identifier; storing the at least one piece of reference person data in association with the corresponding target person identifier; and automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier. An embodiment of the invention also discloses a corresponding face data collection device. The technical solution provided by the embodiments of the invention can reduce the cost of face data collection and save the time and labor that face data collection requires.

Description

Face data collection method and device
Technical field
The present invention relates to the field of image recognition, and in particular to a face data collection method and device.
Background
Face recognition is gradually coming to prominence in many fields. Its success depends on two major factors. The first is the important breakthroughs in deep learning within artificial intelligence, which have brought great success in applications such as image recognition and natural language processing. The second is the availability of large-scale training data, which lets people use deep learning to build better neural networks that simulate the human brain.
In face recognition applications, however, large-scale public data sets have always been lacking. Traditional data collection generally falls into two approaches: 1. collecting data by purchasing it; 2. collecting data manually, then labeling and cleaning it, that is, placing pictures of the same person in the same folder and labeling the folder with an ID number, so that each folder contains pictures of only one person. Either approach has the same drawbacks: high cost, wasted time, and wasted labor.
Summary of the invention
Embodiments of the invention provide a face data collection method and device to reduce the cost of face data collection and save the time and labor it requires.
A first aspect of the embodiments of the invention provides a face data collection method, comprising:
Automatically acquiring at least one piece of reference person data associated with a target person identifier;
Storing the at least one piece of reference person data in association with the corresponding target person identifier;
Automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier.
With reference to the first aspect, in some possible implementations, automatically acquiring the at least one piece of reference person data associated with the target person identifier comprises:
Determining the target person identifier;
Automatically acquiring at least one piece of reference person data associated with the target person identifier.
With reference to the first aspect, in some possible implementations, automatically cleaning the at least one piece of reference person data to obtain the target face data associated with the target person identifier comprises:
Automatically cleaning the at least one piece of reference person data so that the cleaned reference person data includes only reference face data, where the reference person data includes reference face data and reference non-face data;
Detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
If the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, deleting the unmatched reference face data;
If the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retaining the reference face data, to obtain the target face data associated with the target person identifier.
With reference to the first aspect, in some possible implementations, automatically cleaning the at least one piece of reference person data so that the cleaned reference person data includes only reference face data comprises:
Detecting, based on face detection technology, whether a face box exists in the at least one piece of reference person data;
If a face box exists in the reference person data, retaining the reference person data;
If no face box exists in the reference person data, deleting the reference person data.
With reference to the first aspect, in some possible implementations, detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier comprises:
Determining the benchmark face data corresponding to the target person identifier;
Comparing the cleaned reference face data with the benchmark face data;
If the comparison yields a similarity greater than or equal to a first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
If the comparison yields a similarity less than or equal to a second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
A second aspect of the embodiments of the invention provides a face data collection device, the device comprising:
An acquiring unit, configured to automatically acquire at least one piece of reference person data associated with a target person identifier;
A storage unit, configured to store the at least one piece of reference person data in association with the corresponding target person identifier;
A cleaning unit, configured to automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
With reference to the second aspect, in some possible implementations, the acquiring unit specifically comprises:
A determining subunit, configured to determine the target person identifier;
An acquiring subunit, configured to automatically acquire at least one piece of reference person data associated with the target person identifier.
With reference to the second aspect, in some possible implementations, the cleaning unit specifically comprises:
A first cleaning subunit, configured to automatically clean the at least one piece of reference person data so that the cleaned reference person data includes only reference face data, where the reference person data includes reference face data and reference non-face data;
A second cleaning subunit, configured to detect, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the cleaned reference face data does not match the benchmark face data, to delete the unmatched reference face data; and if the cleaned reference face data matches the benchmark face data, to retain the reference face data, to obtain the target face data associated with the target person identifier.
With reference to the second aspect, in some possible implementations, the first cleaning subunit is specifically configured to detect, based on face detection technology, whether a face box exists in the at least one piece of reference person data;
If a face box exists in the reference person data, to retain the reference person data; if no face box exists in the reference person data, to delete the reference person data.
With reference to the second aspect, in some possible implementations, the second cleaning subunit, when detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, is specifically configured to: determine the benchmark face data corresponding to the target person identifier; compare the cleaned reference face data with the benchmark face data; if the comparison yields a similarity greater than or equal to a first preset threshold, determine that the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; and if the comparison yields a similarity less than or equal to a second preset threshold, determine that the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
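The unit structure of the second aspect can be pictured as a small composition of three parts. The following is only a minimal sketch under stated assumptions: the class name `FaceDataCollectionDevice`, its field names, and the choice to model the acquiring and cleaning units as plain callables and the storage unit as a dictionary are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FaceDataCollectionDevice:
    """Hypothetical composition of the acquiring, storage, and cleaning units."""

    # Acquiring unit: maps a target person identifier to reference person data.
    acquiring_unit: Callable[[str], List[str]]
    # Cleaning unit: filters the stored data down to target face data.
    cleaning_unit: Callable[[str, List[str]], List[str]]
    # Storage unit: keeps the data keyed by target person identifier.
    storage_unit: Dict[str, List[str]] = field(default_factory=dict)

    def run(self, person_id: str) -> List[str]:
        # Acquire reference person data and store it under the identifier.
        self.storage_unit[person_id] = self.acquiring_unit(person_id)
        # Clean the stored data; what remains is the target face data.
        cleaned = self.cleaning_unit(person_id, self.storage_unit[person_id])
        self.storage_unit[person_id] = cleaned
        return cleaned
```

In a real device the two callables would wrap a crawler and the two cleaning subunits; here they are left abstract so that the flow between the units is the only thing shown.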
In a third aspect, an embodiment of the invention provides a face data collection device comprising a processor, a memory, a receiver, a transmitter, and a communication bus, where the processor, the memory, the receiver, and the transmitter are connected through the communication bus and communicate with one another through it;
The processor is configured to call executable program code stored in the memory to perform some or all of the steps of any method of the first aspect of the embodiments of the invention.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium that stores program code to be executed by a terminal device. The program code includes execution instructions for performing some or all of the steps of any method of the first aspect of the embodiments of the invention.
As can be seen, in the technical solution of the embodiments of the invention, the face data collection device automatically acquires at least one piece of reference person data associated with a target person identifier, stores it in association with the corresponding target person identifier, and then cleans it automatically to obtain the target face data associated with the target person identifier. Implementing the embodiments of the invention makes it possible to collect target face data that meets given requirements by automated means, which reduces the cost of face data collection, saves the time and labor it requires, and in turn helps build a clean face database of a certain scale.
Brief description of the drawings
To describe the technical solutions in the embodiments of the invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below evidently show only some embodiments of the invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a face data collection method provided by the first embodiment of the invention;
Fig. 1-1 is a schematic diagram of the effect of the preliminary cleaning in the face data collection method provided by the first embodiment of the invention;
Fig. 1-2 is a schematic diagram of the effect of the further cleaning in the face data collection method provided by the first embodiment of the invention;
Fig. 1-3 is a schematic diagram of the effect of face data collection in the face data collection method provided by the first embodiment of the invention;
Fig. 2 is a schematic flowchart of a face data collection method provided by the second embodiment of the invention;
Fig. 3 is a schematic structural diagram of a face data collection device provided by the third embodiment of the invention;
Fig. 4 is a schematic structural diagram of a face data collection device provided by the fourth embodiment of the invention.
Detailed description
To help those skilled in the art better understand the solution of the invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
The terms "first", "second", "third", "fourth" and the like in the description, the claims, and the accompanying drawings are used to distinguish different objects, not to describe a particular order. The terms "comprising" and "having", and any variants of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units; it optionally also includes steps or units that are not listed, or other steps or units inherent to the process, method, product, or device.
A reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. Occurrences of the phrase in various places in the description do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a face data collection method provided by the first embodiment of the invention. As shown in Fig. 1, the face data collection method in this embodiment comprises the following steps:
S101. Automatically acquire at least one piece of reference person data associated with a target person identifier.
First, the target person identifier is determined. Specifically, the target person identifier can be the name of the target person. For example, a list of Asian names can be obtained from the female-celebrity and male-celebrity categories of Baidu Images, or a list of world-famous people (mostly European and American actors) can be obtained from the celebrity list of the Internet Movie Database (IMDB). When determining the target person identifier, collection can focus on images of celebrities and public figures. The advantage is that images of a specified celebrity are easy to obtain online, while the privacy and infringement problems that using such pictures could otherwise bring are avoided.
Second, at least one piece of reference person data associated with the target person identifier is acquired automatically. Specifically, at least one piece of reference person data is acquired for each person identifier. If the target person identifier is the name of the target person, the reference person data can be pictures related to that person. For example, pictures are collected for each celebrity by crawling the image libraries of major websites directly with a web crawler, such as the image search sites Google Image Search, Bing, and Baidu.
S102. Store the at least one piece of reference person data in association with the corresponding target person identifier.
Specifically, pictures are collected and stored for each celebrity separately, yielding a set of folders labeled with the celebrity names, where each folder contains the pictures related to that celebrity name.
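The folder-per-name storage of step S102 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the helper names `folder_for` and `store_images`, the sequential file naming, and the rule for sanitizing identifiers into folder names are assumptions made for the example, and the crawling step is assumed to have already produced the raw image bytes.

```python
import re
from pathlib import Path
from typing import List


def folder_for(root: Path, person_id: str) -> Path:
    """Return (and create) the folder labeled with the target person identifier."""
    # Keep word characters (including CJK) and hyphens; replace anything else.
    safe = re.sub(r"[^\w\-]", "_", person_id.strip())
    folder = root / safe
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def store_images(root: Path, person_id: str, images: List[bytes],
                 ext: str = ".jpg") -> List[Path]:
    """Write each downloaded image into the person's folder and return the paths."""
    folder = folder_for(root, person_id)
    start = len(list(folder.iterdir()))  # continue numbering after existing files
    paths = []
    for offset, data in enumerate(images):
        path = folder / f"{start + offset:06d}{ext}"
        path.write_bytes(data)
        paths.append(path)
    return paths
```

Calling `store_images(Path("dataset"), "yaochen", downloaded)` would place the pictures under `dataset/yaochen/`, so the folder name itself carries the association between the data and the identifier.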
S103. Automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
First, the at least one piece of reference person data is cleaned automatically so that the cleaned reference person data includes only reference face data, where the reference person data includes reference face data and reference non-face data. Specifically, this step makes a first improvement to the purity of each folder. Among the pictures collected in the previous stage, the purity of the image data in each folder is low, that is, some pictures are not face pictures. Each folder therefore needs a preliminary cleaning so that it contains only face data. This is done by running OpenCV or another toolkit that includes face detection (such as the five-key-point detection package provided by Professor Tang Xiaoou of the Chinese University of Hong Kong) over the collected folders. The principle is as follows: the position of the face is located using facial key points, and a face box is returned according to the positions of the key points; whether a picture contains a face is judged by whether a face box is returned for it. If a picture does not contain a face, it is deleted; otherwise it is retained. This completes the preliminary cleaning of the data. The effect of the preliminary cleaning is shown in Fig. 1-1.
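The preliminary cleaning can be sketched as a folder pass that deletes every picture for which no face box is returned. The sketch below is an assumption-laden illustration: the function names and the list of image suffixes are hypothetical, and the detector shown is OpenCV's stock Haar cascade, which is swapped in for the key-point-based detection the text describes because it ships with the `opencv-python` package. Any detector that reports whether a face box exists can be passed in instead.

```python
from pathlib import Path
from typing import Callable, List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of picture types


def opencv_has_face(path: Path) -> bool:
    """Return True if OpenCV's stock Haar cascade finds at least one face box."""
    import cv2  # imported lazily so the rest of the sketch runs without OpenCV

    image = cv2.imread(str(path))
    if image is None:  # unreadable file or not an image
        return False
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(boxes) > 0


def preliminary_clean(folder: Path, has_face: Callable[[Path], bool]) -> List[Path]:
    """Delete every picture in `folder` without a face box; return the deleted paths."""
    deleted = []
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # leave non-picture files untouched
        if not has_face(path):
            path.unlink()
            deleted.append(path)
    return deleted
```

A typical call would be `preliminary_clean(Path("dataset/yaochen"), opencv_has_face)`. Passing the detector as a parameter keeps the deletion logic independent of the detection toolkit.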
Second, based on face recognition technology, it is detected whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier. If it does not match, the unmatched reference face data is deleted; if it matches, the reference face data is retained, yielding the target face data associated with the target person identifier.
Specifically, the face data in each folder is purified further. The reference person data that remains after the preliminary cleaning includes only reference face data, but this does not guarantee that all of it shows the same person, so further processing is needed. Face data showing the same person as the target person identifier is retained, and face data showing a different person is deleted, which completes the final data purification. The specific practice is as follows. Using existing face recognition technology (for example, by calling the face recognition APIs provided by Face++, LinkFace, or Microsoft), face verification is performed on the faces in each folder. Face images that show the same person as the folder's target person identifier and face images that show a different person are identified and marked separately. In subsequent processing, the face images marked as the same are extracted and the face images marked as different are deleted.
Take a folder whose target person identifier is "yaochen" as an example: the Face++ face recognition API is called to perform 1:N face verification on the pictures in the folder. The detailed process is as follows. First, the benchmark face data Img1 corresponding to the target person identifier is determined, and it is uploaded to obtain the face identifier id1. Then all the pictures in the folder are compared against it: the pictures in the folder are uploaded one by one (when the benchmark face data is itself a picture in the folder, only the pictures other than Img1 are uploaded), each test picture obtains an identifier id2, and the face recognition API is called to return the comparison result for id1 and id2. The returned result is true or false: true means the same person, and false means a different person. A picture whose result is false is renamed to start with "XXXX_", and a picture whose upload failed is renamed to start with "YYYY_". Next, the image files starting with "XXXX_" are deleted, and the image files starting with "YYYY_" are screened as needed. If there is no requirement on the number of images, the files starting with "YYYY_" can be deleted directly; if as many pictures of the same person as possible are wanted, the files starting with "YYYY_" are screened. The effect of the further cleaning is shown in Fig. 1-2. Finally, the effect of collecting the face data associated with the target person identifier is shown in Fig. 1-3.
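The marking and deletion scheme described above can be sketched as two folder passes. This is a hedged illustration only: `mark_folder`, `purge`, and the `same_person` callable are hypothetical names, and `same_person` stands in for whatever face verification service is used (no real Face++, LinkFace, or Microsoft API call is shown). An exception raised by `same_person` is treated here as the "upload failed" case.

```python
from pathlib import Path
from typing import Callable, List

MISMATCH_PREFIX = "XXXX_"  # verification returned false (different person)
FAILED_PREFIX = "YYYY_"    # upload or verification call failed


def mark_folder(folder: Path, benchmark: Path,
                same_person: Callable[[Path, Path], bool]) -> None:
    """Rename mismatching pictures to XXXX_* and failed ones to YYYY_*."""
    for path in sorted(folder.iterdir()):
        if path == benchmark or not path.is_file():
            continue  # the benchmark picture itself is not re-verified
        try:
            matched = same_person(benchmark, path)
        except Exception:
            path.rename(path.with_name(FAILED_PREFIX + path.name))
            continue
        if not matched:
            path.rename(path.with_name(MISMATCH_PREFIX + path.name))


def purge(folder: Path, keep_failed: bool = False) -> List[str]:
    """Delete marked pictures; optionally keep the YYYY_ ones for later screening."""
    removed = []
    for path in sorted(folder.iterdir()):
        drop = path.name.startswith(MISMATCH_PREFIX) or (
            path.name.startswith(FAILED_PREFIX) and not keep_failed
        )
        if drop:
            path.unlink()
            removed.append(path.name)
    return removed
```

Setting `keep_failed=True` corresponds to the case where as many pictures as possible are wanted and the "YYYY_" files are screened again later; the default deletes them directly.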
A specific implementation of automatically acquiring the at least one piece of reference person data associated with the target person identifier can be:
Determining the target person identifier;
Automatically acquiring at least one piece of reference person data associated with the target person identifier.
A specific implementation of automatically cleaning the at least one piece of reference person data to obtain the target face data associated with the target person identifier can be:
Automatically cleaning the at least one piece of reference person data so that the cleaned reference person data includes only reference face data, where the reference person data includes reference face data and reference non-face data;
Detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
If the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retaining the reference face data;
If the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, deleting the unmatched reference face data, to obtain the target face data associated with the target person identifier.
A specific implementation of automatically cleaning the at least one piece of reference person data so that the cleaned reference person data includes only reference face data can be:
Detecting, based on face detection technology, whether a face box exists in the at least one piece of reference person data;
If a face box exists in the reference person data, retaining the reference person data;
If no face box exists in the reference person data, deleting the reference person data.
A specific implementation of detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier can be:
Determining the benchmark face data corresponding to the target person identifier;
Comparing the cleaned reference face data with the benchmark face data;
If the comparison yields a similarity greater than or equal to a first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
If the comparison yields a similarity less than or equal to a second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
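The two-threshold decision in the last two items can be written as a small function. The sketch below is illustrative only: the function name `judge_match` and the "undetermined" outcome are assumptions, since the text does not say what happens when the similarity falls strictly between the second and the first preset threshold.

```python
def judge_match(similarity: float, first_threshold: float,
                second_threshold: float) -> str:
    """Classify a comparison score against the two preset thresholds."""
    if second_threshold > first_threshold:
        raise ValueError("second_threshold must not exceed first_threshold")
    if similarity >= first_threshold:
        return "match"      # retain the reference face data
    if similarity <= second_threshold:
        return "mismatch"   # delete the reference face data
    # Assumption: scores between the thresholds are left for separate handling.
    return "undetermined"
```

With example thresholds of 0.8 and 0.6 (values chosen purely for illustration), a score of 0.9 is a match, 0.5 is a mismatch, and 0.7 falls in the gap. Setting both thresholds to the same value removes the gap.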
As can be seen, in the technical solution of this embodiment, the face data collection device automatically acquires at least one piece of reference person data associated with a target person identifier, stores it in association with the corresponding target person identifier, and then cleans it automatically to obtain the target face data associated with the target person identifier. Implementing this embodiment makes it possible to collect target face data that meets given requirements by automated means, which reduces the cost of face data collection, saves the time and labor it requires, and in turn helps build a clean face database of a certain scale.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a face data collection method provided by the second embodiment of the present invention. As shown in Fig. 2, the face data collection method in this embodiment comprises the following steps:
S201: Determine the target person identifier.
S202: Automatically obtain at least one piece of reference person data associated with the target person identifier.
S203: Store the at least one piece of reference person data in association with the corresponding target person identifier.
S204: Detect, based on face detection technology, whether a face frame exists in the at least one piece of reference person data.
If no face frame is detected in the at least one piece of reference person data based on face detection technology, step S205 is performed; otherwise, step S206 is performed.
S205: If no face frame exists in the at least one piece of reference person data, delete the reference person data.
S206: If a face frame exists in the at least one piece of reference person data, retain the reference person data.
The reference person data includes reference face data and reference non-face data.
S207: Determine the benchmark face data corresponding to the target person identifier.
S208: Compare the cleaned reference face data with the benchmark face data.
If the result of comparing the cleaned reference face data with the benchmark face data is a similarity less than or equal to the second preset threshold, step S209 is performed; otherwise, step S210 is performed.
S209: If the result of comparing the cleaned reference face data with the benchmark face data is a similarity less than or equal to the second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, and the unmatched reference face data is deleted.
S210: If the result of comparing the cleaned reference face data with the benchmark face data is a similarity greater than or equal to the first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, and the reference face data is retained.
S211: Obtain the target face data associated with the target person identifier.
For the specific implementation of steps S201 to S211, refer to the related description of steps S101 to S103 in the first embodiment of the present invention.
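The flow of steps S201 to S211 can be illustrated with a minimal Python sketch. It is not the implementation disclosed here: the function name `collect_target_faces` and the injected callables `fetch`, `has_face_frame`, and `matches` are hypothetical stand-ins for the data source, the face detection technology, and the face recognition comparison, none of which the text specifies. Overwriting the stored list is one assumed way to model the deletion of unmatched data.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

Item = TypeVar("Item")


def collect_target_faces(
    target_id: str,
    fetch: Callable[[str], Iterable[Item]],
    has_face_frame: Callable[[Item], bool],
    benchmark: Item,
    matches: Callable[[Item, Item], bool],
    store: Dict[str, List[Item]],
) -> List[Item]:
    """Two-stage cleaning that mirrors steps S201 to S211 (all helpers are assumed)."""
    # S202: automatically obtain reference person data for the target identifier.
    references = list(fetch(target_id))
    # S203: store the data in association with the target person identifier.
    store[target_id] = references
    # S204 to S206: retain only items in which a face frame is detected.
    with_faces = [item for item in references if has_face_frame(item)]
    # S207 to S210: retain only items that match the benchmark face data.
    target_faces = [item for item in with_faces if matches(item, benchmark)]
    # S211: the surviving items are the target face data; overwriting the
    # stored list models the deletion of rejected items (an assumption).
    store[target_id] = target_faces
    return target_faces
```

Passing the detector and the comparison as parameters keeps the sketch independent of any particular face detection or recognition library.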
As can be seen, in the technical solution of this embodiment of the present invention, the face data collection device automatically obtains at least one piece of reference person data associated with a target person identifier, stores the at least one piece of reference person data in association with the corresponding target person identifier, and then automatically cleans the at least one piece of reference person data to obtain target face data associated with the target person identifier. By implementing this embodiment, target face data that meets given requirements can be collected by an automated method, which reduces the cost of face data collection, saves the time and labor it requires, and in turn facilitates building a clean face database of a certain scale.
The following are apparatus embodiments of the present invention, which are used to perform the methods implemented in method embodiments one and two. For ease of description, only the parts related to the embodiments of the present invention are shown; for specific technical details that are not disclosed, refer to embodiments one and two of the present invention.
Referring to Fig. 3, Fig. 3 is a schematic structural diagram of a face data collection device provided by the third embodiment of the present invention. As shown in Fig. 3, the face data collection device in this embodiment includes the following units:
An acquiring unit 301, configured to automatically obtain at least one piece of reference person data associated with a target person identifier;
A storage unit 302, configured to store the at least one piece of reference person data in association with the corresponding target person identifier;
A cleaning unit 303, configured to automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
Optionally, the acquiring unit 301 specifically includes:
A determining subunit 3011, configured to determine the target person identifier;
An obtaining subunit 3012, configured to automatically obtain at least one piece of reference person data associated with the target person identifier.
Optionally, the cleaning unit 303 specifically includes:
A first cleaning subunit 3031, configured to automatically clean the at least one piece of reference person data so that the cleaned reference person data includes only reference face data, wherein the reference person data includes reference face data and reference non-face data;
A second cleaning subunit 3032, configured to detect, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, delete the unmatched reference face data; and if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retain the reference face data, so as to obtain the target face data associated with the target person identifier.
Optionally, the first cleaning subunit 3031 is specifically configured to detect, based on face detection technology, whether a face frame exists in the at least one piece of reference person data; if a face frame exists in the at least one piece of reference person data, retain the reference person data; and if no face frame exists in the at least one piece of reference person data, delete the reference person data.
Optionally, when detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, the second cleaning subunit 3032 is specifically configured to determine the benchmark face data corresponding to the target person identifier and compare the cleaned reference face data with the benchmark face data. If the result of the comparison is a similarity greater than or equal to the first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the result of the comparison is a similarity less than or equal to the second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
Specifically, for the implementation of the above units, refer to the description of the related steps in the embodiments corresponding to Fig. 1 and Fig. 2; details are not repeated here.
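The division into an acquiring unit, a storage unit, and a cleaning unit can be pictured as three small cooperating objects. The following Python sketch is only an illustration under stated assumptions: the class names, the `fetch` and `keep` callables, and the in-memory dictionary used for associated storage are all hypothetical, and the single `keep` predicate stands in for both cleaning subunits 3031 and 3032.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class AcquiringUnit:
    """Unit 301: obtains reference person data for a target person identifier."""
    fetch: Callable[[str], Iterable[str]]  # hypothetical data source

    def acquire(self, target_id: str) -> List[str]:
        return list(self.fetch(target_id))


@dataclass
class StorageUnit:
    """Unit 302: stores data in association with the target person identifier."""
    records: Dict[str, List[str]] = field(default_factory=dict)

    def save(self, target_id: str, items: List[str]) -> None:
        self.records[target_id] = list(items)


@dataclass
class CleaningUnit:
    """Unit 303: drops items rejected by the injected predicate."""
    keep: Callable[[str], bool]  # stands in for subunits 3031 and 3032

    def clean(self, items: List[str]) -> List[str]:
        return [item for item in items if self.keep(item)]


@dataclass
class FaceDataCollector:
    """Composes the three units in the order acquire, store, clean."""
    acquiring: AcquiringUnit
    storage: StorageUnit
    cleaning: CleaningUnit

    def run(self, target_id: str) -> List[str]:
        items = self.acquiring.acquire(target_id)
        self.storage.save(target_id, items)
        cleaned = self.cleaning.clean(items)
        # Saving again replaces the raw data with the cleaned target face data.
        self.storage.save(target_id, cleaned)
        return cleaned
```

Because the units only interact through `run`, each one can be merged, split, or replaced independently, which is consistent with the statement that the unit division is a logical one.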
As can be seen, in the technical solution of this embodiment of the present invention, the face data collection device automatically obtains at least one piece of reference person data associated with a target person identifier, stores the at least one piece of reference person data in association with the corresponding target person identifier, and then automatically cleans the at least one piece of reference person data to obtain target face data associated with the target person identifier. By implementing this embodiment, target face data that meets given requirements can be collected by an automated method, which reduces the cost of face data collection, saves the time and labor it requires, and in turn facilitates building a clean face database of a certain scale.
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a face data collection device provided by the fourth embodiment of the present invention. As shown in Fig. 4, the face data collection device in this embodiment includes at least one processor 401 (for example, a CPU), at least one receiver 403, at least one memory 404, at least one transmitter 405, and at least one communication bus 402. The communication bus 402 is used to implement connection and communication between these components. The receiver 403 and the transmitter 405 of the device in this embodiment may be wired transmission ports, or may be wireless devices, for example including an antenna apparatus, and are used to exchange signaling or data with other node devices. The memory 404 may be a high-speed RAM memory or a non-volatile memory, for example at least one magnetic disk memory. Optionally, the memory 404 may also be at least one storage device located remotely from the processor 401. A set of program code is stored in the memory 404, and the processor 401 can call the code stored in the memory 404 through the communication bus 402 to perform the related functions.
The processor 401 is configured to automatically obtain at least one piece of reference person data associated with a target person identifier; store the at least one piece of reference person data in association with the corresponding target person identifier; and automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
Optionally, when automatically obtaining at least one piece of reference person data associated with a target person identifier, the processor 401 is specifically configured to determine the target person identifier and automatically obtain at least one piece of reference person data associated with the target person identifier.
Optionally, when automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier, the processor 401 is specifically configured to automatically clean the at least one piece of reference person data so that the cleaned reference person data includes only reference face data, wherein the reference person data includes reference face data and reference non-face data; detect, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, delete the unmatched reference face data; and if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retain the reference face data, so as to obtain the target face data associated with the target person identifier.
Optionally, when automatically cleaning the at least one piece of reference person data so that the cleaned reference person data includes only reference face data, the processor 401 is specifically configured to detect, based on face detection technology, whether a face frame exists in the at least one piece of reference person data; if a face frame exists in the at least one piece of reference person data, retain the reference person data; and if no face frame exists in the at least one piece of reference person data, delete the reference person data.
Optionally, when detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, the processor 401 is specifically configured to determine the benchmark face data corresponding to the target person identifier and compare the cleaned reference face data with the benchmark face data. If the result of the comparison is a similarity greater than or equal to the first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the result of the comparison is a similarity less than or equal to the second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
Specifically, for the implementation of the above units, refer to the description of the related steps in the embodiments corresponding to Fig. 1 and Fig. 2; details are not repeated here.
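The two-threshold comparison can be made concrete with a short Python sketch. Several points are assumptions rather than part of the disclosure: the text does not say how similarity is computed, so cosine similarity over feature vectors is used purely as an example; it does not state how the two thresholds relate, so the sketch assumes the second threshold is not larger than the first; and it does not say what happens to a similarity that falls strictly between the thresholds, so the sketch reports that case as "undecided". The function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Example similarity measure (an assumption; the text names none)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / norm


def classify(similarity: float, first_threshold: float, second_threshold: float) -> str:
    """Apply the first (match) and second (mismatch) preset thresholds."""
    if second_threshold > first_threshold:
        raise ValueError("sketch assumes second_threshold <= first_threshold")
    if similarity >= first_threshold:
        return "match"       # retain the reference face data
    if similarity <= second_threshold:
        return "mismatch"    # delete the reference face data
    return "undecided"       # case not covered by the text


# Usage: compare a cleaned reference vector with a benchmark vector.
label = classify(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 0.8, 0.5)
```

When the two thresholds are set to the same value, a similarity exactly equal to it is treated as a match in this sketch, because the match test runs first.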
As can be seen, in the technical solution of this embodiment of the present invention, the face data collection device automatically obtains at least one piece of reference person data associated with a target person identifier, stores the at least one piece of reference person data in association with the corresponding target person identifier, and then automatically cleans the at least one piece of reference person data to obtain target face data associated with the target person identifier. By implementing this embodiment, target face data that meets given requirements can be collected by an automated method, which reduces the cost of face data collection, saves the time and labor it requires, and in turn facilitates building a clean face database of a certain scale.
An embodiment of the present invention also provides a computer storage medium. The computer storage medium may store a program, and when executed the program performs some or all of the steps of any face data collection method described in the above method embodiments.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of action combinations. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions and units involved are not necessarily required by the present invention.
The steps of the methods in the embodiments of the present invention may be reordered, merged, or deleted according to actual needs. The units of the terminal in the embodiments of the present invention may be integrated, further divided, or deleted according to actual needs.
In the above embodiments, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely schematic. The division of the units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections between devices or units through some interfaces, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Those of ordinary skill in the art will understand that all or some of the steps in the methods of the above embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The face data collection method and device provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help understand the method of the present invention and its core idea. Meanwhile, those of ordinary skill in the art may, based on the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A face data collection method, characterized in that the method comprises:
Automatically obtaining at least one piece of reference person data associated with a target person identifier;
Storing the at least one piece of reference person data in association with the corresponding target person identifier;
Automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier.
2. The method of claim 1, characterized in that automatically obtaining at least one piece of reference person data associated with a target person identifier comprises:
Determining the target person identifier;
Automatically obtaining at least one piece of reference person data associated with the target person identifier.
3. The method of claim 1, characterized in that automatically cleaning the at least one piece of reference person data to obtain target face data associated with the target person identifier comprises:
Automatically cleaning the at least one piece of reference person data so that the cleaned reference person data includes only reference face data, wherein the reference person data includes reference face data and reference non-face data;
Detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
If the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, deleting the unmatched reference face data;
If the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retaining the reference face data, so as to obtain the target face data associated with the target person identifier.
4. The method of claim 3, characterized in that automatically cleaning the at least one piece of reference person data so that the cleaned reference person data includes only reference face data comprises:
Detecting, based on face detection technology, whether a face frame exists in the at least one piece of reference person data;
If a face frame exists in the at least one piece of reference person data, retaining the reference person data;
If no face frame exists in the at least one piece of reference person data, deleting the reference person data.
5. The method of claim 3, characterized in that detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier comprises:
Determining the benchmark face data corresponding to the target person identifier;
Comparing the cleaned reference face data with the benchmark face data;
If the result of comparing the cleaned reference face data with the benchmark face data is a similarity greater than or equal to a first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier;
If the result of comparing the cleaned reference face data with the benchmark face data is a similarity less than or equal to a second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
6. A face data collection device, characterized in that the device comprises:
An acquiring unit, configured to automatically obtain at least one piece of reference person data associated with a target person identifier;
A storage unit, configured to store the at least one piece of reference person data in association with the corresponding target person identifier;
A cleaning unit, configured to automatically clean the at least one piece of reference person data to obtain target face data associated with the target person identifier.
7. The device of claim 6, characterized in that the acquiring unit specifically includes:
A determining subunit, configured to determine the target person identifier;
An obtaining subunit, configured to automatically obtain at least one piece of reference person data associated with the target person identifier.
8. The device of claim 6, characterized in that the cleaning unit specifically includes:
A first cleaning subunit, configured to automatically clean the at least one piece of reference person data so that the cleaned reference person data includes only reference face data, wherein the reference person data includes reference face data and reference non-face data;
A second cleaning subunit, configured to detect, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier, delete the unmatched reference face data; and if the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, retain the reference face data, so as to obtain the target face data associated with the target person identifier.
9. The device of claim 8, characterized in that
The first cleaning subunit is specifically configured to detect, based on face detection technology, whether a face frame exists in the at least one piece of reference person data;
If a face frame exists in the at least one piece of reference person data, retain the reference person data; and if no face frame exists in the at least one piece of reference person data, delete the reference person data.
10. The device of claim 8, characterized in that
When detecting, based on face recognition technology, whether the cleaned reference face data matches the benchmark face data corresponding to the target person identifier, the second cleaning subunit is specifically configured to determine the benchmark face data corresponding to the target person identifier and compare the cleaned reference face data with the benchmark face data; if the result of the comparison is a similarity greater than or equal to a first preset threshold, the cleaned reference face data matches the benchmark face data corresponding to the target person identifier; if the result of the comparison is a similarity less than or equal to a second preset threshold, the cleaned reference face data does not match the benchmark face data corresponding to the target person identifier.
CN201610949218.0A 2016-11-02 2016-11-02 A kind of human face data collection method and device Pending CN106844412A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610949218.0A CN106844412A (en) 2016-11-02 2016-11-02 A kind of human face data collection method and device

Publications (1)

Publication Number Publication Date
CN106844412A 2017-06-13

Family

ID=59145989

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610949218.0A Pending CN106844412A (en) 2016-11-02 2016-11-02 A kind of human face data collection method and device

Country Status (1)

Country Link
CN (1) CN106844412A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103793697A (en) * 2014-02-17 2014-05-14 北京旷视科技有限公司 Identity labeling method of face images and face identity recognition method of face images
CN105468760A (en) * 2015-12-01 2016-04-06 北京奇虎科技有限公司 Method and apparatus for labeling face images
CN105608418A (en) * 2015-12-16 2016-05-25 广东欧珀移动通信有限公司 Picture processing method and device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932343A (en) * 2018-07-24 2018-12-04 南京甄视智能科技有限公司 The data set cleaning method and system of face image database
CN108932343B (en) * 2018-07-24 2020-03-27 南京甄视智能科技有限公司 Data set cleaning method and system for human face image database
CN109241310A (en) * 2018-07-25 2019-01-18 南京甄视智能科技有限公司 The data duplicate removal method and system of face image database
CN109241310B (en) * 2018-07-25 2020-05-01 南京甄视智能科技有限公司 Data duplication removing method and system for human face image database
CN110826390A (en) * 2019-09-09 2020-02-21 博云视觉(北京)科技有限公司 Video data processing method based on face vector characteristics
CN110826390B (en) * 2019-09-09 2023-09-08 博云视觉(北京)科技有限公司 Video data processing method based on face vector characteristics
CN110717091A (en) * 2019-09-16 2020-01-21 苏宁云计算有限公司 Entry data expansion method and device based on face recognition
CN110807108A (en) * 2019-10-15 2020-02-18 华南理工大学 Asian face data automatic collection and cleaning method and system
WO2021072998A1 (en) * 2019-10-15 2021-04-22 华南理工大学 Method and system for automatic collection and cleaning of asian face data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170613