CN105138977A - Face identification method under big data environment - Google Patents

Face identification method under big data environment

Info

Publication number
CN105138977A
CN105138977A (application CN201510507364.3A)
Authority
CN
China
Prior art keywords
image
face
images
module
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510507364.3A
Other languages
Chinese (zh)
Inventor
许驰 (Xu Chi)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHENGDU DINGZHIHUI SCIENCE AND TECHNOLOGY Co Ltd
Original Assignee
CHENGDU DINGZHIHUI SCIENCE AND TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHENGDU DINGZHIHUI SCIENCE AND TECHNOLOGY Co Ltd filed Critical CHENGDU DINGZHIHUI SCIENCE AND TECHNOLOGY Co Ltd
Priority to CN201510507364.3A
Publication of CN105138977A
Legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a face identification method for a big data environment. The method comprises: step S100, collecting original face images; step S200, extracting training set images; step S300, training a classifier; step S400, labeling images; and step S500, performing face identification. The method uses a distributed cloud computing mode to acquire face image data from the network quickly and effectively, and effectively improves the efficiency and accuracy of image detection.

Description

Face recognition method in big data environment
Technical Field
The invention relates to the field of image processing, in particular to a face recognition method in a big data environment.
Background
Biometric recognition is a new kind of identification technology that combines biotechnology with information technology, closely integrating computer technology, optics, acoustics, biosensors, biometrics and other high-tech means to verify identity from the physiological or behavioral characteristics of the human body. Security authentication based on biometric identification has become an indispensable means of authentication. The biological and behavioral characteristics of the human body are inherent to each person, so this kind of identification is an ideal way to distinguish individuals and is also suitable for authentication applications with high security requirements.
As a typical biometric identification mode, face recognition has become an important research direction in the field of pattern recognition and has broad application prospects. In recent years, the rapid development of the mobile Internet has also generated new requirements for face recognition applications, but the large amount of computation in traditional face recognition methods places high demands on hardware such as memory capacity and battery endurance in a mobile environment, making it difficult to handle the large data volumes of the mobile Internet.
Cloud computing is a new way of providing IT resources: relying on powerful distributed computing capability, it lets thousands of end users run all kinds of applications on the computing power of a network-connected hardware platform. Hadoop, developed by the Apache Foundation, is a distributed system infrastructure on which users can build distributed computing platforms without knowing the low-level details of distribution. Hadoop has two core components: the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a distributed file system that hides lower-layer details such as load balancing and redundant replication and is suitable for deployment on inexpensive machines. It provides high-throughput data access, is well suited to applications on large-scale data sets, and exposes a uniform file system API (application program interface) to upper-layer programs. HDFS has a single name node, which manages metadata operations and controls the placement of data blocks; the blocks themselves are stored by the data nodes. MapReduce denotes the two operations map and reduce, to which most distributed operations can be abstracted: map decomposes the input into intermediate key/value pairs, and reduce synthesizes the values of each key into the final output. The programmer provides these two functions to the system; the underlying infrastructure runs the map and reduce operations distributed across the cluster and stores the results on the distributed file system. The user submits a MapReduce task to the main node, and the JobTracker distributes the task to the child nodes for parallel processing.
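For illustration, the map/reduce abstraction described above can be sketched in a few lines of Python. This single-machine simulation of the two user-provided functions is an expository assumption, not part of the patent (the word-count task and all names are chosen here for illustration); a real Hadoop job would run the same two functions distributed across the cluster.

```python
from itertools import groupby
from operator import itemgetter

# map: decompose each input record into intermediate key/value pairs
def map_fn(_, line):
    for word in line.split():
        yield word, 1

# reduce: synthesize all values of one key into the final output
def reduce_fn(key, values):
    yield key, sum(values)

def run_mapreduce(records, map_fn, reduce_fn):
    # Shuffle phase: group intermediate pairs by key (Hadoop does this
    # across the cluster; here we simply sort in memory).
    intermediate = sorted(
        (pair for rec in enumerate(records) for pair in map_fn(*rec)),
        key=itemgetter(0),
    )
    for key, group in groupby(intermediate, key=itemgetter(0)):
        yield from reduce_fn(key, (v for _, v in group))

if __name__ == "__main__":
    lines = ["big data face recognition", "face image big data"]
    print(dict(run_mapreduce(lines, map_fn, reduce_fn)))
    # {'big': 2, 'data': 2, 'face': 2, 'image': 1, 'recognition': 1}
```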
Disclosure of Invention
The invention provides a face recognition method in a big data environment, aiming to solve the prior-art problems of low face recognition processing efficiency and poor accuracy in big data environments.
The face recognition method provided by the invention comprises the following steps:
step S100, collecting original face images
The image capture module submits a user's capture task to an image capture platform, connects to the main node of a cloud computing platform through the SSH (Secure Shell) protocol, and captures the required set of original face images from the Internet;
step S200, extracting training set images
The training set extraction module performs pLSA-model-based topic clustering analysis on the original face image set and selects training set images in a user-interactive mode;
step S300, classifier training
The classifier learning module trains a classifier from the training set images provided by the user, obtaining a classifier for image labeling;
step S400, image annotation
The classification and labeling module uses the classifier obtained in step S300 to classify and label the input face image or face image sequence;
step S500, face recognition
Face recognition comprises inputting a labeled face image and detecting, on the Internet and/or in a local face image database, images whose similarity to the input image is greater than a given threshold.
Wherein, step S500 includes:
the input face image and the images in the local face image database have both undergone the image annotation of step S400.
Wherein, step S400 includes:
the classification labeling module comprises 3 submodules, namely a parameter updating module, an SVM classification module and a category generating labeling module; the user sets the parameters of the tasks through the updating parameter setting module, the SVM classification module classifies and labels the face images or the face image sequences according to the parameter setting of the user, and a class labeling file is generated through the class labeling generation module.
Wherein, step S500 includes:
the face recognition comprises: respectively comparing the similarity of the label and the content of the input image with the label and the content of the image on the Internet and/or in a local human face image database, and calculating according to the following formula:
D=αB+βN,
b is the similarity between image labels, N is the similarity between image contents, alpha and beta are respectively the weight occupied by the image labels and the image contents, and D is the similarity between the images obtained after the similarity between the image labels and the image contents is comprehensively considered; detecting an image with D larger than a given threshold value; and taking the image with the D larger than a given threshold value as a final face recognition result.
Wherein, comparing the similarity of the content of the input image with the content of images on the Internet and/or in the local face image database comprises:
for the input image, obtaining the global features of the image from its gray values and texture features.
The face recognition method in a big data environment further comprises the following step:
the obtained global features are further converted into a binary signal and a residual signal.
Comparing the similarity of the content of the input image with the content of images on the Internet and/or in the local face image database further comprises:
finding candidate images by measuring the Hamming distance between binary characteristic signals; specifically, a hash table is constructed from the binary characteristic signals of the images, image numbers with different binary characteristic signals are put into different hash buckets, and, given a query image, the hash buckets whose binary characteristic signals lie within Hamming distance 2 of the query's signal are found by hash operations; the images contained in those hash buckets are then compared further, and the resulting image search results are taken as the candidate images.
Comparing the similarity of the content of the input image with the content of images on the Internet and/or in the local face image database further comprises:
the image numbers stored in the hash buckets are used to retrieve the corresponding image residual signals, the candidate images are reordered by measuring the Euclidean distance between residual signals, and the top-ranked images whose residual distance to the query image is below a given threshold are output as the final image search result.
The invention adopts a distributed cloud computing mode, can acquire face image data from the network quickly and effectively, applies different strategies during face recognition according to the different characteristics of image labels and image content, and effectively improves the efficiency and accuracy of face recognition.
Drawings
FIG. 1 is a flow chart of a face recognition method in a big data environment according to the present invention;
Detailed Description
The technical solution of the present invention will be described clearly and completely below with reference to the accompanying drawings. Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described below do not represent all embodiments consistent with the present invention; rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
Referring to FIG. 1, the face recognition method in a big data environment provided by the invention includes:
Step S100, collecting original face images
The image capture module submits a user's capture task to an image capture platform, connects to the main node of a cloud computing platform through the SSH protocol, and captures the required set of original face images from the Internet.
An image grabber built on an existing text-based image search engine implements the cloud-computing image capture platform, so that original face images are captured efficiently and quickly.
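A minimal sketch of how a capture task might be submitted to the platform's main node over SSH, assuming the third-party paramiko library; the host name, credentials, and the remote grabber command are hypothetical placeholders, since the patent does not specify them.

```python
import paramiko

def submit_capture_task(keyword: str, out_dir: str) -> str:
    """Submit an image-capture job to the cloud platform's main node over SSH."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    # Hypothetical main-node address and credentials.
    client.connect("master.example.com", username="hadoop", password="secret")
    try:
        # Hypothetical grabber job installed on the cluster: crawls the web
        # for images matching `keyword` and writes them into HDFS.
        cmd = f"hadoop jar grabber.jar -query '{keyword}' -output '{out_dir}'"
        _, stdout, stderr = client.exec_command(cmd)
        exit_code = stdout.channel.recv_exit_status()  # block until the job ends
        if exit_code != 0:
            raise RuntimeError(stderr.read().decode())
        return stdout.read().decode()
    finally:
        client.close()

# submit_capture_task("face", "/data/raw_faces")
```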
Step S200, extracting training set images
The training set extraction module performs pLSA-model-based topic clustering analysis on the original face image set and selects training set images in a user-interactive mode.
The training set extraction module is divided into 3 submodules: a parameter updating module, a pLSA clustering module, and a training set selection module. The user sets the parameters of the training set extraction task through the parameter updating module; the pLSA clustering module performs topic clustering analysis on the original face images according to the user's parameter settings; and after the analysis finishes, the user selects the required training set images through the training set selection module.
Step S300, classifier training
The classifier learning module trains a classifier from the training set images provided by the user, obtaining a classifier for image labeling.
The classifier used in the invention is a support vector machine (SVM).
The classifier learning module is further divided into 3 submodules: parameter updating, SVM learning, and classifier updating. The user sets the parameters of the classifier learning task through the parameter updating module, the SVM learning module learns the classifier model from the training set according to the user's parameter settings, and the classifier updating module stores and updates the classifier model after the operation succeeds.
Other classifiers may also be used in the present invention.
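A minimal training sketch under the SVM choice above, using scikit-learn's SVC; the feature dimensionality, class count, and RBF kernel are illustrative assumptions, since the patent only names a support vector machine:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data: X holds one visual-feature vector per
# training image, y holds the semantic class chosen by the user.
rng = np.random.default_rng(0)
X = rng.random((200, 116))           # e.g. the 116-dim global features below
y = rng.integers(0, 5, size=200)     # 5 hypothetical semantic classes

# Multi-class SVM with an RBF kernel (kernel choice is an assumption).
clf = SVC(kernel="rbf", probability=True).fit(X, y)

# The classifier-updating submodule would persist the trained model, e.g.:
# import joblib; joblib.dump(clf, "classifier_model.pkl")
labels = clf.predict(X[:3])
```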
The images I_i obtained in step S200 are all stored in an image database I = {I_1, I_2, …, I_N}, where I serves as the training image sample set and N is the number of training image samples. The visual features F = {F_1, F_2, …, F_N} describe the image content, and the semantic keywords ω_i used to label images form the semantic vocabulary W = {ω_1, ω_2, …, ω_M}. Given an unlabeled image I, the goal of automatic semantic image annotation is to extract the optimal keyword set W* that describes the content of the image. Unlike a generative model, which links visual features and semantic concepts by estimating their joint probability distribution, the annotation model treats image annotation as a multi-class classification problem in which each annotation word in the semantic vocabulary defines a semantic class. Assuming the visual feature vector of the image I to be annotated is X, the annotation can be expressed as:
$$P(\omega_i \mid X) = \frac{P(X \mid \omega_i)\,P(\omega_i)}{P(X)},$$
where P(ω_i), the prior probability of the i-th annotation word, can be regarded as uniformly distributed, and P(X | ω_i), the class-conditional probability density of the i-th semantic class, can be modeled by a multidimensional normal density function. The discriminant function of the normally distributed Bayesian classifier can then be expressed as:
$$h_i(X) = P(X \mid \omega_i)\,P(\omega_i) = \frac{1}{(2\pi)^{n/2}\,|S_i|^{1/2}} \exp\!\left[-\frac{1}{2}\left(X-\overline{X^{(\omega_i)}}\right)^{\mathsf{T}} S_i^{-1}\left(X-\overline{X^{(\omega_i)}}\right)\right] P(\omega_i),$$
where $\overline{X^{(\omega_i)}}$ is the mean vector of class ω_i and S_i is its covariance matrix. Treating the semantic concepts as mutually independent, the best annotation for a test image I is
$$\omega_i^{*} = \arg\max_i P(\omega_i \mid X) = \arg\max_i h_i(X).$$
An image may have more than one semantic keyword, so several annotation words can be selected by ranking P(ω_i | X) or h_i(X).
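The normally distributed Bayesian annotation above admits a direct numpy sketch; evaluating h_i(X) in the log domain and regularizing the covariance S_i are implementation assumptions added here for numerical stability:

```python
import numpy as np

def train_gaussian_classes(X: np.ndarray, y: np.ndarray):
    """Estimate the prior, mean vector and covariance matrix S_i per class."""
    classes = np.unique(y)
    priors = np.array([(y == c).mean() for c in classes])   # P(omega_i)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    covs = np.array([np.cov(X[y == c].T) + 1e-6 * np.eye(X.shape[1])
                     for c in classes])                      # regularized S_i
    return classes, priors, means, covs

def log_h(x, priors, means, covs):
    """log h_i(x) = log P(x|omega_i) + log P(omega_i), computed stably."""
    n = x.shape[0]
    scores = []
    for p, m, S in zip(priors, means, covs):
        diff = x - m
        _, logdet = np.linalg.slogdet(S)
        quad = diff @ np.linalg.solve(S, diff)               # Mahalanobis term
        scores.append(-0.5 * (n * np.log(2 * np.pi) + logdet + quad) + np.log(p))
    return np.array(scores)

def annotate(x, classes, priors, means, covs, top_k=3):
    """Return the top-k annotation words ranked by h_i(x)."""
    order = np.argsort(log_h(x, priors, means, covs))[::-1]
    return classes[order[:top_k]]
```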
Step S400, image annotation
The classification and labeling module uses the classifier obtained in step S300 to classify and label the input face image or face image sequence.
The classification and labeling module is divided into 3 submodules: parameter updating, SVM classification, and category label generation. The user sets task parameters through the parameter updating module, the SVM classification module classifies and labels the face image or face image sequence according to the user's parameter settings, and the category label generation module produces a category label file.
Classification and labeling must satisfy 4 functional requirements: capturing original face images, extracting a training set, learning a classifier model, and performing classification and labeling. A user can generate a training set from the original data set through training set extraction, then learn a classifier model, classify and label images with that model, and store the classification and labeling results for querying or retrieval.
Combining these requirements with the system architecture, the invention divides the system into an image capture module, a training set extraction module, a classifier learning module and a classification and labeling module. The image capture module submits a user's capture task to the image capture platform, connects to the main node of the cloud computing platform through the SSH (Secure Shell) protocol, and captures the required set of original face images from the Internet; the training set extraction module performs pLSA-model-based topic clustering analysis on the original face image set and selects training set images in a user-interactive mode; the classifier learning module learns a classifier model from the training set images provided by the user and saves it as a classifier model file; and the classification and labeling module classifies and labels images or image sequences and generates a category label file.
In the invention, cloud technology first harnesses the computing power of many machines on the Internet, accelerating image capture to obtain the required set of original face images. Second, once a sufficiently large original data set has been captured, the training set extraction module helps the user select a suitable training set interactively. Third, the classifier is trained by the classifier learning module. Finally, the classification and labeling module classifies and labels new images using the classifier.
Step S500, face recognition
Face recognition comprises inputting a labeled face image and detecting, on the Internet and/or in a local face image database, images whose similarity to the input image is greater than a given threshold.
The face recognition comprises: first, comparing the label of the input image with the labels of images on the Internet and/or in a local face image database, and detecting the images whose similarity exceeds a first threshold; then, comparing the content of the input image with the content of those images, and detecting the images whose similarity exceeds a second threshold; the images whose similarity exceeds the second threshold are taken as the final face recognition result.
In this embodiment of the invention, image labels are compared first, so that the image set semantically related to the input image's labels is detected, and image contents are then compared to detect images with similar content. Because detection based on image annotation is fast while detection based on image content is much more time-consuming, this embodiment improves the accuracy of face recognition while also increasing detection speed.
Or,
the face recognition comprises: firstly, comparing the similarity of the content of the input image with the content of the image in the Internet and/or a local human face image database, and detecting the image with the similarity larger than a first threshold value; then, similarity comparison is carried out on the label of the input image and the label of the image with the similarity larger than a first threshold value, and the image with the similarity larger than a second threshold value is detected; and taking the image with the similarity larger than the second threshold value as a final face recognition result.
In the above embodiment of the present invention, the content of the image is used for detection and comparison, so that the image set related to the content of the input image can be detected first, and then the comparison of the image labels is performed to detect the images with similar labels. Since images having similar contents sometimes have to be expressed with dissimilar meanings, this embodiment can detect images having both contents and semantically related to the input image.
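Both cascade orderings described above can be expressed with one small routine; this sketch treats the label and content similarity measures as given functions, whose concrete forms appear later in the description:

```python
def cascade_search(query, database, sim_label, sim_content,
                   t1: float, t2: float, label_first: bool = True):
    """Two-stage face search: filter with one cue, confirm with the other.

    sim_label / sim_content are similarity functions; t1 and t2 are the
    first and second thresholds. Both orderings from the description are
    supported via label_first."""
    first, second = ((sim_label, sim_content) if label_first
                     else (sim_content, sim_label))
    # Stage 1: cheap filter over the whole database.
    candidates = [img for img in database if first(query, img) > t1]
    # Stage 2: the more expensive comparison over the survivors only.
    return [img for img in candidates if second(query, img) > t2]
```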
Or,
the face recognition comprises: respectively comparing the similarity of the label and the content of the input image with the label and the content of the image on the Internet and/or in a local human face image database, and calculating according to the following formula:
D=αB+βN,
b is the similarity between image labels, N is the similarity between image contents, alpha and beta are respectively the weight occupied by the image labels and the image contents, and D is the similarity between the images obtained after the similarity between the image labels and the image contents is comprehensively considered; detecting an image with D larger than a given threshold value; and taking the image with the D larger than a given threshold value as a final face recognition result.
In the embodiment of the invention, the influence of image annotation and image content on the face recognition result is considered at the same time, so that a relatively more objective detection result can be obtained.
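A direct sketch of the fused score D = αB + βN; the equal default weights are an illustrative assumption, since the patent leaves α and β to the user:

```python
def fused_similarity(b: float, n: float,
                     alpha: float = 0.5, beta: float = 0.5) -> float:
    """D = alpha*B + beta*N, combining label and content similarity."""
    return alpha * b + beta * n

def recognize(query, database, sim_label, sim_content, threshold: float):
    """Keep every database image whose fused similarity D exceeds the threshold."""
    return [img for img in database
            if fused_similarity(sim_label(query, img),
                                sim_content(query, img)) > threshold]
```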
Both the input face image and the images in the local face image database are annotated as in step S400; of course, no special restriction need be placed on the image annotation.
For the comparison of image contents, an input face image is first uniformly divided into 8×8 area blocks, and the median of the gray values of the pixels in each block is computed, producing a 64-dimensional gray-value vector. To better represent the visual content of the image, texture features are used as an important supplement to the gray-value features: the image is divided into 2×2 area blocks, a gradient histogram is computed in each block with the gradient direction quantized into 12 bins, and one extra feature dimension records the proportion of pixels whose gradient value is zero. This finally forms a global feature of 8×8 + 2×2×(12+1) = 116 dimensions.
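The 116-dimensional global feature can be sketched as follows; the gradient operator, the normalization of each histogram by its block's non-zero-gradient pixel count, and the exact block splitting are implementation assumptions:

```python
import numpy as np

def global_feature(gray: np.ndarray) -> np.ndarray:
    """116-dim global feature: 8x8 block gray medians + 2x2 block gradient
    histograms (12 direction bins + 1 zero-gradient ratio per block)."""
    h, w = gray.shape
    # 64 dims: median gray value of each of the 8x8 blocks.
    medians = [np.median(gray[i*h//8:(i+1)*h//8, j*w//8:(j+1)*w//8])
               for i in range(8) for j in range(8)]
    # Gradients for the texture part.
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    tex = []
    for i in range(2):
        for j in range(2):
            m = mag[i*h//2:(i+1)*h//2, j*w//2:(j+1)*w//2]
            a = ang[i*h//2:(i+1)*h//2, j*w//2:(j+1)*w//2]
            nonzero = m > 0
            # 12-bin histogram of gradient direction over non-zero gradients...
            hist, _ = np.histogram(a[nonzero], bins=12, range=(0, 2*np.pi))
            hist = hist / max(nonzero.sum(), 1)
            # ...plus one dimension for the proportion of zero-gradient pixels.
            tex.extend(hist.tolist() + [1.0 - nonzero.mean()])
    return np.asarray(medians + tex)   # 64 + 4*13 = 116 dims

# feat = global_feature(np.random.default_rng(0).random((64, 64)))
# assert feat.shape == (116,)
```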
Linear scanning in a feature space composed of very-large-scale global features is still a very time-consuming operation. To further improve retrieval and matching efficiency, a simple and effective hash algorithm converts the original features into a binary signal and a residual signal. Specifically, principal component analysis (PCA) is applied to the original features, the feature dimensions with variance larger than a given threshold are retained, and each retained dimension is binarized against its median in the feature space (values above the median quantize to 1, otherwise 0) to obtain the corresponding binary signal. In addition to these binary signals, the first 24 dimensions after PCA processing are kept as residual signals for further similarity comparison, in order to improve matching accuracy.
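A sketch of the signal construction under the description above; the variance threshold, which in practice would be tuned so that roughly 16 dimensions survive (matching the 16-dimensional binary signal mentioned below), is an illustrative assumption:

```python
import numpy as np

def build_signals(feats: np.ndarray, var_thresh: float = 1e-3, n_resid: int = 24):
    """Convert global features into binary + residual signals via PCA.

    Returns (binary_signals, residual_signals, medians); the medians are
    needed to binarize query features the same way."""
    centered = feats - feats.mean(axis=0)
    # PCA through SVD; columns of vt.T are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    var = (s ** 2) / (len(feats) - 1)
    keep = var > var_thresh                     # keep high-variance dimensions
    proj = centered @ vt.T[:, keep]             # features in the PCA space
    medians = np.median(proj, axis=0)
    binary = (proj > medians).astype(np.uint8)  # 1 above the median, else 0
    residual = (centered @ vt.T)[:, :n_resid]   # first 24 PCA dims as residual
    return binary, residual, medians
```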
Face recognition search with the extracted characteristic signals proceeds in the following two steps: 1) hash search based on the binary signals; 2) reordering based on the residual signals.
Hash search: because the characteristic signals are generated by the linear transformation of PCA, the distances between the characteristic signals approximate the distances between the corresponding images, so candidate images can be found effectively by measuring the Hamming distance between binary characteristic signals. Specifically, a hash table is conveniently constructed from the binary characteristic signals of the images, with image numbers having different characteristic signals placed in different hash buckets. Given a query image, the hash buckets whose signals lie within Hamming distance 2 of the query's binary characteristic signal can be found quickly by hash operations; for a 16-dimensional binary signal there are at most C(16,0) + C(16,1) + C(16,2) = 1 + 16 + 120 = 137 such buckets. The final image search result is therefore obtained by further comparing only the images contained in these 137 hash buckets, effectively avoiding unnecessary computation; the images so obtained serve as the candidate images.
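A sketch of the bucket probing described above: all signatures within Hamming distance 2 of the query signature are enumerated by flipping at most two bits, which for 16 bits probes at most the 137 buckets counted above:

```python
import itertools
from collections import defaultdict

def build_hash_table(binary_signals):
    """Bucket image numbers by their binary signature (as an int key)."""
    table = defaultdict(list)
    for img_id, sig in enumerate(binary_signals):
        key = int("".join(map(str, sig)), 2)
        table[key].append(img_id)
    return table

def hash_search(query_sig, table, max_dist: int = 2):
    """Return candidate image ids from all buckets within Hamming distance 2.

    For a 16-bit signature this probes at most 1 + 16 + 120 = 137 buckets."""
    nbits = len(query_sig)
    qkey = int("".join(map(str, query_sig)), 2)
    candidates = []
    for ndiff in range(max_dist + 1):
        for flips in itertools.combinations(range(nbits), ndiff):
            key = qkey
            for b in flips:
                key ^= 1 << (nbits - 1 - b)   # flip bit b of the signature
            candidates.extend(table.get(key, []))
    return candidates
```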
Residual reordering: the image numbers stored in the hash buckets are used to retrieve the corresponding image residual signals, and the candidate images are reordered by measuring the Euclidean distance between residual signals. The top-ranked images whose residual distance to the query image is below a given threshold are output as the detection result of the final image search.
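The reordering step then reduces to a sort over Euclidean distances between residual signals; the top-k cutoff is an illustrative assumption:

```python
import numpy as np

def rerank(query_resid, residuals, candidates, dist_thresh: float, top_k: int = 10):
    """Reorder candidates by Euclidean distance between residual signals and
    keep the top-ranked images within the distance threshold."""
    dists = [(np.linalg.norm(residuals[i] - query_resid), i) for i in candidates]
    dists.sort()
    return [i for d, i in dists[:top_k] if d < dist_thresh]
```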
Step S600, image output
The detection result images from step S500 are sorted in descending order of similarity and output to the user.
The invention adopts a distributed cloud computing mode, can acquire image data from the network quickly and effectively, applies different strategies during face recognition according to the different characteristics of image labels and image content, and effectively improves the efficiency and accuracy of face recognition.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles, including such departures from the present disclosure as come within known or customary practice in the art to which the invention pertains.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims (8)

1. A face recognition method in a big data environment, comprising the following steps:
step S100, collecting original face images
The image capture module submits a user's capture task to an image capture platform, connects to the main node of a cloud computing platform through the SSH (Secure Shell) protocol, and captures the required set of original face images from the Internet;
step S200, extracting training set images
The training set extraction module performs pLSA-model-based topic clustering analysis on the original face image set and selects training set images in a user-interactive mode;
step S300, classifier training
The classifier learning module trains a classifier from the training set images provided by the user, obtaining a classifier for image labeling;
step S400, image annotation
The classification and labeling module uses the classifier obtained in step S300 to classify and label the input face image or face image sequence;
step S500, face recognition
Face recognition comprises inputting a labeled face image and detecting, on the Internet and/or in a local face image database, images whose similarity to the input image is greater than a given threshold.
2. The face recognition method in a big data environment as claimed in claim 1, wherein step S500 comprises:
both the input face image and the images in the local face image database have undergone the image annotation of step S400.
3. The face recognition method in a big data environment as claimed in claim 2, wherein step S400 comprises:
the classification and labeling module comprises 3 submodules: a parameter updating module, an SVM classification module, and a category label generation module; the user sets task parameters through the parameter updating module, the SVM classification module classifies and labels the face image or face image sequence according to the user's parameter settings, and the category label generation module produces a category label file.
4. The face recognition method in a big data environment as claimed in claim 1, wherein step S500 comprises:
the face recognition comprises: comparing the label and the content of the input image with the labels and the contents of images on the Internet and/or in a local face image database, computing similarity according to the following formula:
D = αB + βN,
where B is the similarity between image labels, N is the similarity between image contents, α and β are the weights assigned to the image labels and the image contents respectively, and D is the overall similarity between images obtained by jointly considering label similarity and content similarity; images whose D exceeds a given threshold are detected and taken as the final face recognition result.
5. The face recognition method in a big data environment as claimed in claim 4, wherein comparing the similarity of the content of the input image with the content of images on the Internet and/or in the local face image database comprises:
for the input image, obtaining the global features of the image from its gray values and texture features.
6. The face recognition method in a big data environment as claimed in claim 5, further comprising:
the obtained global features are further converted into a binary signal and a residual signal.
7. The face recognition method in a big data environment as claimed in claim 6, wherein comparing the similarity of the content of the input image with the content of images on the Internet and/or in the local face image database further comprises:
finding candidate images by measuring the Hamming distance between binary characteristic signals; specifically, a hash table is constructed from the binary characteristic signals of the images, image numbers with different binary characteristic signals are put into different hash buckets, and, given a query image, the hash buckets whose binary characteristic signals lie within Hamming distance 2 of the query's signal are found by hash operations; the images contained in those hash buckets are then compared further, and the resulting image search results are taken as the candidate images.
8. The face recognition method in a big data environment as claimed in claim 7, wherein comparing the similarity of the content of the input image with the content of images on the Internet and/or in the local face image database further comprises:
the image numbers stored in the hash buckets are used to retrieve the corresponding image residual signals, the candidate images are reordered by measuring the Euclidean distance between residual signals, and the top-ranked images whose residual distance to the query image is below a given threshold are output as the final image search result.
CN201510507364.3A 2015-08-18 2015-08-18 Face identification method under big data environment Pending CN105138977A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510507364.3A CN105138977A (en) 2015-08-18 2015-08-18 Face identification method under big data environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510507364.3A CN105138977A (en) 2015-08-18 2015-08-18 Face identification method under big data environment

Publications (1)

Publication Number Publication Date
CN105138977A true CN105138977A (en) 2015-12-09

Family

ID=54724321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510507364.3A Pending CN105138977A (en) 2015-08-18 2015-08-18 Face identification method under big data environment

Country Status (1)

Country Link
CN (1) CN105138977A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294762A (en) * 2016-08-11 2017-01-04 齐鲁工业大学 A kind of entity recognition method based on study
CN107545306A (en) * 2017-07-05 2018-01-05 安徽奇智科技有限公司 A kind of big data analysis platform based on cloud computing
CN107633304A (en) * 2017-07-28 2018-01-26 中国电子科技集团公司第四十八研究所 A kind of learning method of sleeping position monitoring
CN108229772A (en) * 2016-12-14 2018-06-29 北京国双科技有限公司 Mark processing method and processing device
CN108319938A (en) * 2017-12-31 2018-07-24 奥瞳系统科技有限公司 High quality training data preparation system for high-performance face identification system
CN108805212A (en) * 2018-06-14 2018-11-13 新联智慧信息技术(深圳)有限公司 The processing method and Related product of big data
CN108860150A (en) * 2018-07-03 2018-11-23 百度在线网络技术(北京)有限公司 Automobile brake method, apparatus, equipment and computer readable storage medium
CN109325397A (en) * 2018-06-14 2019-02-12 新联智慧信息技术(深圳)有限公司 Method and Related product based on AI Intelligent treatment big data
CN110825808A (en) * 2019-09-23 2020-02-21 重庆特斯联智慧科技股份有限公司 Distributed human face database system based on edge calculation and generation method thereof
CN111488825A (en) * 2020-04-09 2020-08-04 贵州爱信诺航天信息有限公司 Face recognition and identity card verification system based on Internet of things and big data technology
CN113611291A (en) * 2020-08-12 2021-11-05 广东电网有限责任公司 Speech recognition algorithm for electric power major

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2402535A (en) * 2003-06-05 2004-12-08 Canon Kk Face recognition
CN103902704A (en) * 2014-03-31 2014-07-02 华中科技大学 Multi-dimensional inverted index and quick retrieval algorithm for large-scale image visual features

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2402535A (en) * 2003-06-05 2004-12-08 Canon Kk Face recognition
CN103902704A (en) * 2014-03-31 2014-07-02 华中科技大学 Multi-dimensional inverted index and quick retrieval algorithm for large-scale image visual features

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
付海燕: "Research on Large-Scale Image Retrieval Methods Based on Image Hashing", China Doctoral Dissertations Full-text Database *
刘康苗 et al.: "Staged Image Clustering Based on Fused Visual and Semantic Features", Journal of Zhejiang University (Engineering Science) *
陆寄远 et al.: "Image Classification and Annotation Based on the Hadoop Cloud Computing Platform", Telecommunications Science *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294762A (en) * 2016-08-11 2017-01-04 齐鲁工业大学 A kind of entity recognition method based on study
CN106294762B (en) * 2016-08-11 2019-12-10 齐鲁工业大学 Entity identification method based on learning
CN108229772A (en) * 2016-12-14 2018-06-29 北京国双科技有限公司 Mark processing method and processing device
CN107545306A (en) * 2017-07-05 2018-01-05 安徽奇智科技有限公司 A kind of big data analysis platform based on cloud computing
CN107633304B (en) * 2017-07-28 2020-12-11 中国电子科技集团公司第四十八研究所 Learning method for sleeping posture monitoring
CN107633304A (en) * 2017-07-28 2018-01-26 中国电子科技集团公司第四十八研究所 A kind of learning method of sleeping position monitoring
CN108319938A (en) * 2017-12-31 2018-07-24 奥瞳系统科技有限公司 High quality training data preparation system for high-performance face identification system
CN108319938B (en) * 2017-12-31 2022-05-17 奥瞳系统科技有限公司 High-quality training data preparation system for high-performance face recognition system
CN108805212A (en) * 2018-06-14 2018-11-13 新联智慧信息技术(深圳)有限公司 The processing method and Related product of big data
CN109325397A (en) * 2018-06-14 2019-02-12 新联智慧信息技术(深圳)有限公司 Method and Related product based on AI Intelligent treatment big data
CN108860150A (en) * 2018-07-03 2018-11-23 百度在线网络技术(北京)有限公司 Automobile brake method, apparatus, equipment and computer readable storage medium
CN110825808A (en) * 2019-09-23 2020-02-21 重庆特斯联智慧科技股份有限公司 Distributed human face database system based on edge calculation and generation method thereof
CN111488825A (en) * 2020-04-09 2020-08-04 贵州爱信诺航天信息有限公司 Face recognition and identity card verification system based on Internet of things and big data technology
CN113611291A (en) * 2020-08-12 2021-11-05 广东电网有限责任公司 Speech recognition algorithm for electric power major

Similar Documents

Publication Publication Date Title
CN105138977A (en) Face identification method under big data environment
TWI753034B (en) Method, device and electronic device for generating and searching feature vector
Zhang et al. Finding celebrities in billions of web images
CN102414680B (en) Utilize the semantic event detection of cross-domain knowledge
CN102902821B (en) The image high-level semantics mark of much-talked-about topic Network Based, search method and device
CN112819023B (en) Sample set acquisition method, device, computer equipment and storage medium
Wang et al. Deep cascaded cross-modal correlation learning for fine-grained sketch-based image retrieval
Abdul-Rashid et al. Shrec’18 track: 2d image-based 3d scene retrieval
Tian et al. Image classification based on the combination of text features and visual features
CN114372532B (en) Method, device, equipment, medium and product for determining label labeling quality
CN105069136A (en) Image recognition method in big data environment
CN105117735A (en) Image detection method in big data environment
CN113704623B (en) Data recommendation method, device, equipment and storage medium
Manisha et al. Content-based image retrieval through semantic image segmentation
Yang et al. Large scale video data analysis based on spark
Lv et al. Retrieval oriented deep feature learning with complementary supervision mining
Perdana et al. Instance-based deep transfer learning on cross-domain image captioning
Tian et al. Automatic image annotation with real-world community contributed data set
Wang et al. Listen, look, and find the one: Robust person search with multimodality index
Mahalakshmi et al. Collaborative text and image based information retrieval model using bilstm and residual networks
JP2014102772A (en) Program, device, and method for calculating similarity between contents represented by sets of feature vectors
Dornaika et al. Image-based face beauty analysis via graph-based semi-supervised learning
Zhou et al. RFSEN-ELM: SELECTIVE ENSEMBLE OF EXTREME LEARNING MACHINES USING ROTATION FOREST FOR IMAGE CLASSIFICATION.
Xie et al. Analyzing semantic correlation for cross-modal retrieval
Kabbur An Efficient Multiclass Medical Image CBIR System Based on Classification and Clustering

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20151209