CN114332993A - Face recognition method and device, electronic equipment and computer readable storage medium - Google Patents

Face recognition method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN114332993A
CN114332993A CN202111552557.2A
Authority
CN
China
Prior art keywords
illumination
image
characteristic
stage
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111552557.2A
Other languages
Chinese (zh)
Inventor
黄泽元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xumi Yuntu Space Technology Co Ltd
Original Assignee
Shenzhen Jizhi Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jizhi Digital Technology Co Ltd
Priority to CN202111552557.2A
Publication of CN114332993A
Legal status: Pending (current)

Abstract

The disclosure relates to the technical field of computer vision, and provides a face recognition method, a face recognition device, an electronic device and a computer-readable storage medium. The method comprises the following steps: acquiring an image to be processed; performing feature extraction on the image to be processed to obtain a feature image of the image to be processed; performing feature transformation on the feature image to obtain a spatial illumination semantic graph corresponding to the image to be processed; performing feature transformation on the feature image based on the spatial illumination semantic graph to obtain a channel illumination semantic graph corresponding to the image to be processed; performing illumination separation on the image to be processed according to the spatial illumination semantic graph and the channel illumination semantic graph to obtain non-illumination features of the image to be processed; and performing face recognition based on the non-illumination features to obtain a face recognition result. The method separates non-illumination features, which contain no illumination features, from the image to be processed for face recognition; this reduces the interference of illumination features in the image with face recognition as much as possible, yields a better face recognition result, and improves face recognition accuracy.

Description

Face recognition method and device, electronic equipment and computer readable storage medium
Technical Field
The present disclosure relates to the field of computer vision technologies, and in particular, to a face recognition method and apparatus, an electronic device, and a computer-readable storage medium.
Background
Face recognition is a biometric technique for identifying a person's identity based on facial feature information. A series of related technologies, also commonly called portrait recognition or facial recognition, collect images or video streams containing faces with a camera or video camera, automatically detect and track the faces in the images, and then perform face recognition on the detected faces. With the development of computer vision technology, face recognition technology has become increasingly mature.
Face recognition is sensitive to the illumination conditions of the collected image or video stream containing the face: a good recognition result can only be obtained under favorable illumination conditions, and recognition errors easily occur once adverse conditions such as overexposure or backlight arise.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a face recognition method, an apparatus, an electronic device, and a computer-readable storage medium, so as to solve the problem in the prior art that different illumination conditions have a large influence on a face recognition result.
In a first aspect of the embodiments of the present disclosure, a face recognition method is provided, including:
acquiring an image to be processed;
performing feature extraction on the image to be processed to obtain a feature image of the image to be processed;
performing feature transformation on the feature image to obtain a spatial illumination semantic graph corresponding to the image to be processed;
performing feature transformation on the feature image based on the spatial illumination semantic graph to obtain a channel illumination semantic graph corresponding to the image to be processed;
performing illumination separation on the image to be processed according to the spatial illumination semantic graph and the channel illumination semantic graph to obtain non-illumination features of the image to be processed;
and carrying out face recognition based on the non-illumination characteristics to obtain a face recognition result.
In a second aspect of the embodiments of the present disclosure, a face recognition apparatus is provided, including:
an acquisition module configured to acquire an image to be processed;
the feature extraction module is configured to perform feature extraction on the image to be processed to obtain a feature image of the image to be processed;
the first feature transformation module is configured to perform feature transformation on the feature image to obtain a spatial illumination semantic graph corresponding to the image to be processed;
the second feature transformation module is configured to perform feature transformation on the feature image based on the spatial illumination semantic graph to obtain a channel illumination semantic graph corresponding to the image to be processed;
the illumination separation module is configured to perform illumination separation on the image to be processed according to the spatial illumination semantic graph and the channel illumination semantic graph to obtain non-illumination features of the image to be processed;
and the face recognition module is configured to perform face recognition based on the non-illumination characteristics to obtain a face recognition result.
In a third aspect of the embodiments of the present disclosure, an electronic device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the above method when executing the computer program.
In a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor, implements the steps of the above-mentioned method.
Compared with the prior art, the embodiments of the present disclosure have the following beneficial effects: a spatial illumination semantic graph and a channel illumination semantic graph of the image to be processed are obtained by performing feature extraction and feature transformation on the image to be processed, illumination separation is performed according to the two semantic graphs to obtain the non-illumination features of the image to be processed, and finally face recognition is performed on the non-illumination features to obtain a face recognition result. By computing the spatial and channel illumination semantic graphs and using both of them for illumination separation, non-illumination features that contain no illumination features are separated from the image to be processed for face recognition, which reduces the interference of illumination features in the image with face recognition as much as possible, yields a better face recognition result, and improves the accuracy of face recognition.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without inventive effort.
FIG. 1 is a scenario diagram of an application scenario of an embodiment of the present disclosure;
fig. 2 is a schematic flow chart of a face recognition method according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart of a process of performing illumination separation on an image to be processed according to a spatial illumination semantic graph and a channel illumination semantic graph according to the embodiment of the present disclosure;
fig. 4 is another schematic flow chart illustrating illumination separation of an image to be processed according to a spatial illumination semantic graph and a channel illumination semantic graph according to the embodiment of the present disclosure;
FIG. 5 is a schematic flow chart of a training process of a neural network according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a face recognition apparatus according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the disclosed embodiments. However, it will be apparent to one skilled in the art that the present disclosure may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present disclosure with unnecessary detail.
A face recognition method and apparatus according to an embodiment of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 is a scene schematic diagram of an application scenario of an embodiment of the present disclosure. The application scenario may include terminal devices 1, 2, and 3, server 4, and network 5.
The terminal devices 1, 2, and 3 may be hardware or software. When the terminal devices 1, 2 and 3 are hardware, they may be various electronic devices having a display screen and supporting communication with the server 4, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like; when the terminal devices 1, 2, and 3 are software, they may be installed in the electronic devices as above. The terminal devices 1, 2 and 3 may be implemented as a plurality of software or software modules, or may be implemented as a single software or software module, which is not limited by the embodiments of the present disclosure. Further, the terminal devices 1, 2, and 3 may have various applications installed thereon, such as a data processing application, an instant messaging tool, social platform software, a search-type application, a shopping-type application, and the like.
The server 4 may be a server providing various services, for example, a backend server receiving a request sent by a terminal device establishing a communication connection with the server, and the backend server may receive and analyze the request sent by the terminal device and generate a processing result. The server 4 may be one server, may also be a server cluster composed of a plurality of servers, or may also be a cloud computing service center, which is not limited in this disclosure.
The server 4 may be hardware or software. When the server 4 is hardware, it may be various electronic devices that provide various services to the terminal devices 1, 2, and 3. When the server 4 is software, it may be a plurality of software or software modules providing various services for the terminal devices 1, 2, and 3, or may be a single software or software module providing various services for the terminal devices 1, 2, and 3, which is not limited by the embodiment of the present disclosure.
The network 5 may be a wired network connected by a coaxial cable, a twisted pair and an optical fiber, or may be a wireless network that can interconnect various Communication devices without wiring, for example, Bluetooth (Bluetooth), Near Field Communication (NFC), Infrared (Infrared), and the like, which is not limited in the embodiment of the present disclosure.
A user can establish a communication connection with the server 4 via the network 5 through the terminal devices 1, 2, and 3 to receive or transmit information and the like. Specifically, after a user imports an image to be processed into the server 4, the server 4 performs feature extraction on the image to be processed to obtain a feature image, performs feature transformation on the feature image to obtain a spatial illumination semantic graph and a channel illumination semantic graph of the image to be processed respectively, performs illumination separation by combining the spatial illumination semantic graph and the channel illumination semantic graph to obtain the non-illumination features of the image to be processed, and performs face recognition on the non-illumination features to obtain a face recognition result.
It should be noted that the specific types, numbers and combinations of the terminal devices 1, 2 and 3, the server 4 and the network 5 may be adjusted according to the actual requirements of the application scenarios, and the embodiment of the present disclosure does not limit this.
Fig. 2 is a schematic flow chart of a face recognition method according to an embodiment of the present disclosure. The face recognition method of fig. 2 may be performed by the server of fig. 1. As shown in fig. 2, the face recognition method includes:
s201, acquiring an image to be processed.
The image to be processed is an image which needs face recognition, and understandably, the image to be processed includes a face. The image to be processed can be acquired in any mode.
S202, performing feature extraction on the image to be processed to obtain a feature image of the image to be processed.
Feature extraction is a method for extracting desired features through image analysis and transformation. A feature is an (essential) characteristic or property, or a collection of characteristics or properties, that distinguishes one class of objects from another, and is data that can be extracted by measurement or processing. For images, each image has its own characteristics that distinguish it from other images; some are natural characteristics that can be perceived intuitively, such as brightness, edges, textures and colors, while others are obtained by transformation or processing, such as moments, histograms and principal components. In this embodiment, the features obtained by performing feature extraction on the image to be processed are recorded as a feature image.
Feature extraction on the image to be processed can be implemented in any manner. In some embodiments, the feature image may be obtained by performing feature extraction on the input image to be processed through a neural network, such as residual modules. In a specific embodiment, two kinds of residual modules are adopted to extract the features of the image to be processed: the first residual module comprises a down-sampling layer with a stride of 2 and 64 channels, and this is followed by 2 residual modules with a stride of 1 and 64 channels. In some embodiments, the size of the image to be processed is 112×112, and the dimension of the feature image obtained by the above feature extraction module is (56,56,64).
The down-sampling principle is as follows: for an image I of size M×N, down-sampling it by a factor of s yields an image of resolution (M/s)×(N/s), where s is a common divisor of M and N; for an image in matrix form, each s×s window of the original image is reduced to one pixel whose value is the average of all pixels in the window.
In other embodiments, other ways may be adopted to perform feature extraction on the image to be processed to obtain the feature image.
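As an illustrative aid for the residual-module approach described above, the following is a minimal PyTorch sketch of such a feature-extraction stem, assuming a basic residual block in place of the BottleNeck module used in the detailed embodiment; the class names and block internals are assumptions rather than the patented implementation:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Basic residual block used here as a simplified stand-in for a ResNet BottleNeck.
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # 1x1 projection so the shortcut matches shape when stride or channel count changes
        self.shortcut = (nn.Sequential(nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                                       nn.BatchNorm2d(out_ch))
                         if stride != 1 or in_ch != out_ch else nn.Identity())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))

# Feature-extraction stem: one stride-2 residual module with 64 channels,
# followed by two stride-1 residual modules with 64 channels.
stem = nn.Sequential(
    ResidualBlock(3, 64, stride=2),
    ResidualBlock(64, 64),
    ResidualBlock(64, 64),
)

x = torch.randn(1, 3, 112, 112)   # image to be processed, 112 x 112
p0 = stem(x)                      # feature image p0
print(p0.shape)                   # torch.Size([1, 64, 56, 56]), i.e. a (56,56,64) feature image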
S203, performing feature transformation on the feature image to obtain a spatial illumination semantic graph corresponding to the image to be processed.
The feature transformation is a process of further processing and transforming the extracted feature image to obtain new image features. The process of performing feature transformation on the feature image to obtain the spatial illumination semantic graph of the image to be processed will be described in detail in the following embodiments.
S204, performing feature transformation on the feature image based on the spatial illumination semantic graph to obtain a channel illumination semantic graph corresponding to the image to be processed.
After the spatial illumination semantic graph is obtained, the feature image is subjected to feature transformation by combining the spatial illumination semantic graph to obtain a channel illumination semantic graph of the image to be processed, which will be described in detail in the embodiment later.
In some embodiments, the spatial illumination semantic graph and the channel illumination semantic graph respectively represent illumination features in the image to be processed, which are extracted in different ways.
S205, according to the space illumination semantic graph and the channel illumination semantic graph, illumination separation is carried out on the image to be processed, and the non-illumination characteristic of the image to be processed is obtained.
Illumination separation means distinguishing between illuminated and non-illuminated features in a feature image. It is to be understood that, in the present embodiment, the illumination feature represents an illumination-related feature in the image, the non-illumination feature represents an illumination-unrelated feature in the image, and the non-illumination feature does not change with illumination change.
In this embodiment, the image to be processed is separated by illumination to obtain the non-illumination features therein. In some embodiments, performing illumination separation on an image to be processed according to a spatial illumination semantic graph and a channel illumination semantic graph to obtain a non-illumination feature of the image to be processed, including: and performing illumination separation on the image to be processed according to the space illumination semantic graph and the channel illumination semantic graph to obtain illumination characteristics of the image to be processed, and then obtaining non-illumination characteristics according to the illumination characteristics of the image to be processed. Further, in some embodiments, deriving the non-illumination characteristic from the illumination characteristic of the image to be processed comprises: and determining the difference value between 1 and the illumination characteristic as the non-illumination characteristic of the image to be processed. The illumination feature of the image to be processed obtained by performing illumination separation on the image to be processed according to the spatial illumination semantic map and the channel illumination semantic map will be described in detail in the following embodiments.
S206, performing face recognition based on the non-illumination features to obtain a face recognition result.
Face recognition is a biometric technique for identifying a person's identity based on facial feature information: faces in an image or video stream containing faces are automatically detected and tracked, and face identification is then performed on the detected faces. The face recognition may be implemented in any manner; for example, the face recognition result may be obtained by performing face recognition on the non-illumination features through a trained face recognition model. In this embodiment, the image to be processed is subjected to illumination separation, the non-illumination features that contain no illumination features are separated out, and face recognition is performed on these non-illumination features; the illumination-related features in the image are thereby removed, the influence of illumination in the image to be processed on face recognition is avoided, and the accuracy of face recognition is improved.
According to the technical solution provided by the embodiment of the present disclosure, a spatial illumination semantic graph and a channel illumination semantic graph of the image to be processed are obtained by performing feature extraction and feature transformation on the image to be processed, illumination separation is performed according to the two semantic graphs to obtain the non-illumination features of the image to be processed, and finally face recognition is performed on the non-illumination features to obtain a face recognition result. By computing the spatial and channel illumination semantic graphs and using both of them for illumination separation, non-illumination features that contain no illumination features are separated from the image to be processed for face recognition, which reduces the interference of illumination features in the image with face recognition as much as possible, yields a better face recognition result, and improves the accuracy of face recognition.
In some embodiments, performing illumination separation on the image to be processed according to the spatial illumination semantic graph and the channel illumination semantic graph yields the illumination features of the image to be processed in addition to its non-illumination features; in this embodiment, the face recognition method further includes: determining the illumination category of the image to be processed according to the illumination features; and when the illumination category satisfies a preset illumination condition, performing face recognition based on the non-illumination features to obtain a face recognition result.
In this embodiment, the feature image is subjected to illumination separation to obtain the illumination features and non-illumination features of the image to be processed; at the same time, the corresponding illumination category is determined from the illumination features, and the illumination category is checked to decide whether face recognition should be performed.
The illumination category divides illumination into different classes according to illumination conditions. In some embodiments, the illumination conditions may include the light source; in one embodiment, illumination may be divided according to the light source into categories such as natural light on sunny days, natural light on cloudy days, white light and yellow light. In other embodiments, the illumination conditions include the illumination angle; in one embodiment, images may be divided according to the illumination angle into categories such as front light, side light and back light. In other embodiments, the illumination conditions may further include illumination intensity; in a specific embodiment, the illumination of an image may be classified according to intensity as strong light, normal light, weak light, and so on. In other embodiments, the illumination categories further include an exposure category (exposure refers to the amount of light entering the lens and reaching the photosensitive element during photographing, controlled by the combination of aperture, shutter and sensitivity).
Determining the illumination category of the image to be processed from the illumination features can be implemented in any manner. The preset illumination condition can be set according to the actual situation. In some embodiments, the preset illumination condition may be set such that the illumination intensity reaches a preset intensity threshold, the illumination category is front light, and so on; face recognition is allowed only when the illumination category determined from the illumination features of the image to be processed satisfies the above condition. In other embodiments, the preset illumination condition may be set to other conditions.
In other embodiments, when the illumination category does not meet the preset illumination condition, the face recognition of the image to be processed is stopped.
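As a purely illustrative sketch of this gating step (the category names, label ordering and allowed set below are hypothetical assumptions, not taken from the disclosure):

from typing import Optional
import torch

# Hypothetical illumination category labels; the disclosure mentions categories such as
# front light, side light, back light and exposure, but the exact label set is assumed here.
CATEGORY_NAMES = ("front_light", "side_light", "back_light", "exposure")
ALLOWED_CATEGORIES = {"front_light"}  # example of a preset illumination condition

def face_recognition_with_gate(illum_logits: torch.Tensor,
                               non_illum_features: torch.Tensor,
                               recognizer) -> Optional[torch.Tensor]:
    # Perform face recognition only when the predicted illumination category
    # satisfies the preset illumination condition; otherwise stop recognition.
    predicted = CATEGORY_NAMES[int(illum_logits.argmax())]
    if predicted not in ALLOWED_CATEGORIES:
        return None  # illumination condition not satisfied: recognition is stopped
    return recognizer(non_illum_features)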
The face recognition task is generally applied to authentication scenarios, such as logging in to an account or making a payment by face recognition; in these scenarios, successful authentication by face recognition means the account is logged in, the payment is completed, and so on, which may involve the privacy and payment security of users, so the face recognition task requires high accuracy to protect user privacy and security. In the technical solution provided by the embodiment of the present disclosure, while only the non-illumination features of the image to be processed are used for face recognition, the illumination features extracted from the image to be processed are used to recognize the illumination category of the image, and the face recognition task is restricted by the illumination category: recognition is allowed only when the illumination category satisfies the preset illumination condition. This can reduce the influence of illumination conditions on recognizing other people's faces and prevent situations in which the rights of others are infringed by performing face recognition on counterfeit or forged face images.
In some embodiments, performing feature transformation on the feature image to obtain the spatial illumination semantic graph corresponding to the image to be processed includes: performing parallel multi-scale convolution calculation on the feature image to obtain spatial illumination feature images of different fields of view; and fusing the spatial illumination feature images to obtain the spatial illumination semantic graph.
Convolution is a feature extraction method; viewed functionally, the convolution process applies a linear transformation at each position of an image and maps it to a new value. Feature fusion is the optimized combination of different feature vectors extracted from the same modality, and can be performed in serial or parallel fashion.
In some embodiments, performing parallel multi-scale convolution operations on the feature image includes: simultaneously performing convolution operations on the feature image of the image to be processed through convolution modules with different convolution kernels, to obtain spatial illumination feature images with different fields of view. In a specific embodiment, convolution operations at 3 scales are performed on the feature image in parallel to obtain spatial illumination feature images of three fields of view.
Further, in some embodiments, the convolution operation at each scale is performed by a calculation module. In one embodiment, the structure of each calculation module comprises, in order: a convolution layer with kernel size k and 32 channels, batch normalization, a ReLU (Rectified Linear Unit) activation layer, followed by a convolution layer with a 1×1 kernel and 1 channel, a batch normalization layer, and a softmax layer (normalized exponential calculation). The convolution kernel k of the calculation module at each scale may be set differently; taking three calculation modules corresponding to three scales as an example, the convolution kernels k may be set to 3, 5 and 7. In other embodiments, the convolution kernel size of each calculation module may be set to other values according to the actual situation.
In a specific embodiment, the calculating module may be represented as:
f(x;k)=softmax(bn(conv(relu(bn(conv(x;k)));1)))
where x denotes the input feature image, k denotes the convolution kernel size of the corresponding calculation module, conv denotes the convolution operation, bn denotes the batch normalization operation, and relu denotes the activation operation.
The spatial illumination semantic graph may be represented as:
A1=softmax(conv([f(p0;3),f(p0;5),f(p0;7)];1))
where A1 denotes the spatial illumination semantic graph, and p0 denotes the feature image of the image to be processed.
Further, in some embodiments, the spatial illumination features corresponding to the respective fields of view are superimposed to obtain the spatial illumination semantic graph.
According to the technical solution provided by the embodiment of the present disclosure, spatial illumination features of different fields of view of the image to be processed are obtained by performing parallel multi-scale convolution calculation on the feature image, and the spatial illumination features of the fields of view are fused to obtain the spatial illumination semantic graph, which makes the calculation more accurate and extracts the spatial illumination features in the image better.
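As an illustrative aid, a minimal PyTorch sketch of this spatial illumination semantic graph computation under the formulas above follows; the softmax axis (taken over spatial positions) and the padding choices are assumptions, since the disclosure does not specify them:

import torch
import torch.nn as nn
import torch.nn.functional as F

def spatial_softmax(y):
    # softmax over spatial positions, one interpretation of the "softmax layer"
    b, c, h, w = y.shape
    return F.softmax(y.view(b, c, -1), dim=-1).view(b, c, h, w)

class ScaleBranch(nn.Module):
    # One calculation module f(x; k): conv(k, 32) -> bn -> relu -> conv(1x1, 1) -> bn -> softmax
    def __init__(self, in_ch, k):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, 32, k, padding=k // 2)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 1, 1)
        self.bn2 = nn.BatchNorm2d(1)

    def forward(self, x):
        y = F.relu(self.bn1(self.conv1(x)))
        return spatial_softmax(self.bn2(self.conv2(y)))

class SpatialIlluminationSemantics(nn.Module):
    # A1 = softmax(conv([f(p0;3), f(p0;5), f(p0;7)]; 1))
    def __init__(self, in_ch=64):
        super().__init__()
        self.branches = nn.ModuleList([ScaleBranch(in_ch, k) for k in (3, 5, 7)])
        self.fuse = nn.Conv2d(3, 1, 1)

    def forward(self, p0):
        a0 = torch.cat([branch(p0) for branch in self.branches], dim=1)  # (B, 3, 56, 56)
        return spatial_softmax(self.fuse(a0))                            # A1: (B, 1, 56, 56)

A1 = SpatialIlluminationSemantics(64)(torch.randn(1, 64, 56, 56))
print(A1.shape)  # torch.Size([1, 1, 56, 56])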
In some embodiments, performing feature transformation on the feature image based on the spatial illumination semantic graph to obtain the channel illumination semantic graph corresponding to the image to be processed includes: performing convolution on the feature image to obtain an intermediate feature image; performing parallel pooling and dimension transformation on the intermediate feature image using the spatial illumination semantic graph as a weight, to obtain a channel illumination feature and a local channel illumination feature; and performing matrix operations on the channel illumination feature and the local channel illumination feature to obtain the channel illumination semantic graph.
In one embodiment, convolving the feature image to obtain the intermediate feature image includes: performing a convolution calculation with a 1×1 kernel and 64 channels on the feature image to obtain the intermediate feature image.
Wherein pooling, also known as downsampling (downsampling), reduces the size of the data; common pooling methods include maximum pooling, average pooling, and the like. In some embodiments, the spatial illumination semantic graph is used as a weight to perform parallel pooling on the intermediate feature images, that is, the spatial illumination semantic graph is multiplied by the intermediate feature images, and then the products are subjected to parallel pooling.
In some embodiments, performing parallel pooling on the weighted feature image comprises performing two parallel pooling operations; in a specific embodiment, taking a feature image of the image to be processed with dimension (56,56,64) as an example, the dimension of one pooled feature is (7,7,64) and the dimension of the other pooled feature is (1,1,64). Further, dimension transformation is performed on the pooled features to obtain a local channel illumination feature with dimension (49,64) and a channel illumination feature with dimension (1,64), respectively.
In a specific embodiment, performing matrix operations on the channel illumination feature and the local channel illumination feature to obtain the channel illumination semantic graph includes: performing matrix multiplication between the channel illumination feature and the transposed local channel illumination feature, applying softmax, performing matrix multiplication of the result with the local channel illumination feature, and applying softmax again to obtain the channel illumination semantic graph.
Denoting the pooling and dimension transformation operations as P, the above process of determining the channel illumination semantic graph can be represented as:
B0=P(conv(p0)A1;49)
B1=P(conv(p0)A1;1)
wherein p0 represents the feature image to be processed, conv represents the convolution operation, B0 represents the local channel illumination feature, and B1 represents the channel illumination feature.
B2 = softmax(softmax(B1·B0^T)·B0)
where B0^T denotes the transpose of B0, and B2 denotes the channel illumination semantic graph.
According to the technical solution provided by the embodiment of the present disclosure, the feature image is convolved and then multiplied by the spatial illumination semantic graph, parallel pooling and dimension transformation operations are performed to obtain the channel illumination feature and the local channel illumination feature, and matrix operations on these two features compute the query and encoding of the local channel illumination feature by the global channel illumination feature, so that the channel illumination semantic graph can be obtained for subsequent illumination separation.
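As an illustrative aid, a minimal PyTorch sketch of this channel illumination semantic graph computation follows; the pooling type (adaptive average pooling) and the softmax axes are assumptions, since the disclosure specifies only the 1x1 convolution, the output dimensions (7,7,64) and (1,1,64), and the order of the matrix operations:

import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelIlluminationSemantics(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 1)   # 1x1 convolution, 64 channels
        self.pool_local = nn.AdaptiveAvgPool2d(7)      # -> (B, 64, 7, 7)
        self.pool_global = nn.AdaptiveAvgPool2d(1)     # -> (B, 64, 1, 1)

    def forward(self, p0, a1):
        weighted = self.conv(p0) * a1                  # spatial semantic graph used as a weight
        b0 = self.pool_local(weighted).flatten(2).transpose(1, 2)   # local channel feature, (B, 49, 64)
        b1 = self.pool_global(weighted).flatten(2).transpose(1, 2)  # channel feature, (B, 1, 64)
        attn = F.softmax(b1 @ b0.transpose(1, 2), dim=-1)           # (B, 1, 49): global queries local
        b2 = F.softmax(attn @ b0, dim=-1)                           # (B, 1, 64)
        return b2.transpose(1, 2).unsqueeze(-1)                     # B2 reshaped to (B, 64, 1, 1)

p0 = torch.randn(1, 64, 56, 56)
A1 = torch.rand(1, 1, 56, 56)
B2 = ChannelIlluminationSemantics(64)(p0, A1)
print(B2.shape)  # torch.Size([1, 64, 1, 1])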
In some embodiments, as shown in fig. 3, performing illumination separation on an image to be processed according to a spatial illumination semantic graph and a channel illumination semantic graph to obtain a non-illumination feature of the image to be processed and also obtain an illumination feature of the image to be processed, including steps S301 to S303, where:
s301, respectively obtaining a space non-illumination semantic map and a channel non-illumination semantic map according to the space illumination semantic map and the channel illumination semantic map.
In some embodiments, obtaining the spatial non-illumination semantic graph and the channel non-illumination semantic graph according to the spatial illumination semantic graph and the channel illumination semantic graph respectively includes: determining the difference between 1 and the spatial illumination semantic graph as the spatial non-illumination semantic graph, and determining the difference between 1 and the channel illumination semantic graph as the channel non-illumination semantic graph.
In one embodiment, with A1 denoting the spatial illumination semantic graph and B2 denoting the channel illumination semantic graph, the computation of the spatial non-illumination semantic graph and the channel non-illumination semantic graph can be represented as:
A- = 1 - A1
B- = 1 - B2
where A- denotes the spatial non-illumination semantic graph and B- denotes the channel non-illumination semantic graph.
S302, multiplying the feature image by the spatial non-illumination semantic graph and the channel non-illumination semantic graph to obtain the non-illumination features.
S303, multiplying the feature image by the spatial illumination semantic graph and the channel illumination semantic graph to obtain the illumination features.
In this embodiment, the feature obtained by multiplying the feature image by the spatial illumination semantic graph and the channel illumination semantic graph is recorded as the initial illumination feature, and the feature obtained by multiplying the feature image by the spatial non-illumination semantic graph and the channel non-illumination semantic graph is recorded as the initial non-illumination feature. Meanwhile, in this embodiment, the process of multiplying the feature image by the two semantic graphs of space and channel is referred to as the first stage of illumination separation.
The above first stage of determining the initial illumination feature and the initial non-illumination feature may be expressed as:
p'0 = p0·A1·B2
p''0 = p0·A-·B-
where p'0 denotes the initial illumination feature, p''0 denotes the initial non-illumination feature, and p0 denotes the feature image.
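Read together with the formulas above, a minimal PyTorch sketch of this first-stage separation might look as follows (the tensor shapes follow the dimensions given earlier, with broadcasting over channels and spatial positions assumed):

import torch

def illumination_separation_stage1(p0, A1, B2):
    # p0: feature image, (B, 64, 56, 56)
    # A1: spatial illumination semantic graph, (B, 1, 56, 56), broadcast over channels
    # B2: channel illumination semantic graph, (B, 64, 1, 1), broadcast over positions
    A_neg = 1.0 - A1                   # spatial non-illumination semantic graph
    B_neg = 1.0 - B2                   # channel non-illumination semantic graph
    p_illum = p0 * A1 * B2             # initial illumination feature p'0
    p_non_illum = p0 * A_neg * B_neg   # initial non-illumination feature p''0
    return p_illum, p_non_illum

p_illum, p_non_illum = illumination_separation_stage1(
    torch.randn(1, 64, 56, 56), torch.rand(1, 1, 56, 56), torch.rand(1, 64, 1, 1))
print(p_illum.shape, p_non_illum.shape)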
According to the technical solution provided by the embodiment of the present disclosure, after the spatial illumination semantic graph and the channel illumination semantic graph are obtained, the spatial non-illumination semantic graph and the channel non-illumination semantic graph are further determined, and the feature image is combined with these semantic graphs for feature transformation, so that illumination features and non-illumination features are separated from the feature image of the image to be processed. The features in the image to be processed are thereby divided into illumination-related illumination features and illumination-independent non-illumination features, which makes it convenient to perform face recognition with the illumination-independent features, reduces the influence of illumination on face recognition, and improves the accuracy of face recognition.
Further, in some embodiments, the non-illumination feature is the stage non-illumination feature of the first stage, and the illumination feature is the stage illumination feature of the first stage; as shown in fig. 4, performing illumination separation on the image to be processed according to the spatial illumination semantic graph and the channel illumination semantic graph further includes steps S401 to S405, where:
s401, taking the stage illumination characteristic and the stage non-illumination characteristic of the first stage as the initial illumination characteristic and the initial non-illumination characteristic, and inputting the initial illumination characteristic and the initial non-illumination characteristic into the second stage.
In the present embodiment, illumination separation is performed multiple times over successive stages, so that the separation effect is as good as possible.
S402, carrying out residual error processing and illumination separation on the initial illumination features to obtain first intermediate illumination features and first intermediate non-illumination features of the second stage, and carrying out residual error processing and illumination separation on the initial non-illumination features to obtain second intermediate illumination features and second intermediate non-illumination features of the second stage.
For the initial illumination feature (the stage illumination feature of the first stage) input to the second stage, residual processing and illumination separation are performed in sequence, further splitting it into an illumination feature and a non-illumination feature, which are denoted in this embodiment as the first intermediate illumination feature and the first intermediate non-illumination feature. Similarly, after the initial non-illumination feature input to the second stage is subjected to residual processing and illumination separation in sequence, the second intermediate illumination feature and the second intermediate non-illumination feature are obtained.
In a specific embodiment, the residual processing of the initial non-illumination feature and the initial illumination feature includes: passing each through a residual module with a down-sampling stride of 2 and 128 output channels, and then through 3 residual modules with a stride of 1 and 128 output channels, to obtain the residual-processed non-illumination feature and the residual-processed illumination feature.
Further, in a specific embodiment, the illumination separation performed on the residual-processed non-illumination feature and on the residual-processed illumination feature is similar to the process of steps S202 to S204: the spatial illumination semantic graph and the channel illumination semantic graph are computed from the features, and the illumination features and non-illumination features are then determined using these semantic graphs, which is not repeated here.
And S403, obtaining stage illumination characteristics of the second stage according to the first intermediate illumination characteristics and the second intermediate illumination characteristics, and obtaining stage non-illumination characteristics of the second stage according to the first intermediate non-illumination characteristics and the second intermediate non-illumination characteristics.
In some embodiments, deriving the stage illumination feature of the second stage from the first intermediate illumination feature and the second intermediate illumination feature comprises: and carrying out weighted summation on the first intermediate illumination characteristic and the second intermediate illumination characteristic based on the weight parameters to obtain the stage illumination characteristic of the second stage.
In a specific embodiment, performing weighted summation of the first intermediate illumination feature and the second intermediate illumination feature based on the weight parameter to obtain the stage illumination feature of the second stage includes: multiplying the second intermediate illumination feature (the intermediate illumination feature obtained by splitting the stage non-illumination feature of the first stage) by the weight parameter, and determining the sum of this product and the first intermediate illumination feature as the stage illumination feature of the second stage.
Determining the stage illumination feature of the second stage may be expressed as:
p'1 = g'1 + σ·h'1
where σ is the weight parameter, g'1 denotes the first intermediate illumination feature, h'1 denotes the second intermediate illumination feature, and p'1 denotes the stage illumination feature of the second stage.
In further embodiments, deriving the stage non-illumination feature of the second stage from the first intermediate non-illumination feature and the second intermediate non-illumination feature comprises: and carrying out weighted summation on the first intermediate non-illumination characteristic and the second intermediate non-illumination characteristic based on the weight parameters to obtain the stage non-illumination characteristic of the second stage.
In a specific embodiment, performing weighted summation of the first intermediate non-illumination feature and the second intermediate non-illumination feature based on the weight parameter to obtain the stage non-illumination feature of the second stage includes: multiplying the first intermediate non-illumination feature (the intermediate non-illumination feature obtained by splitting the stage illumination feature of the first stage) by the weight parameter, and determining the sum of this product and the second intermediate non-illumination feature as the stage non-illumination feature of the second stage.
Determining the stage non-illumination feature of the second stage may be expressed as:
p''1 = σ·g''1 + h''1
where σ is the weight parameter, g''1 denotes the first intermediate non-illumination feature, h''1 denotes the second intermediate non-illumination feature, and p''1 denotes the stage non-illumination feature of the second stage.
The weight parameter can be set as a fixed value according to the actual situation, and can also be determined through training; in some embodiments, the weighting parameters of different stages may be set to be the same or different.
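A minimal sketch of this per-stage fusion, following the two formulas above (the function name and the use of a single scalar sigma per stage are assumptions):

import torch

def fuse_stage(g_illum: torch.Tensor, g_non: torch.Tensor,
               h_illum: torch.Tensor, h_non: torch.Tensor,
               sigma: float = 0.5):
    # g_illum, g_non: first intermediate illumination / non-illumination features,
    #                 split from the previous stage's illumination feature.
    # h_illum, h_non: second intermediate illumination / non-illumination features,
    #                 split from the previous stage's non-illumination feature.
    # sigma: the stage weight parameter; 0.5 is the initial value mentioned in the training process.
    stage_illum = g_illum + sigma * h_illum     # p'_i  = g'_i  + sigma * h'_i
    stage_non = sigma * g_non + h_non           # p''_i = sigma * g''_i + h''_i
    return stage_illum, stage_non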
S404, taking the stage non-illumination characteristic and the stage illumination characteristic of the second stage as the initial non-illumination characteristic and the initial illumination characteristic, entering the next stage, and repeating the same operation as the second stage.
After the illumination separation of the second stage, illumination separation of a third stage and a fourth stage can be performed following the same operation as the second stage, so that the accuracy of illumination separation is further improved and the finally obtained illumination features and non-illumination features are better distinguished. Therefore, in the present embodiment, the stage illumination feature and the stage non-illumination feature calculated in the second stage are taken as the initial illumination feature and the initial non-illumination feature of the next stage, and the operation of the second stage is repeated, i.e., the third stage of illumination separation is performed.
S405, after the stages corresponding to the preset number are executed, the obtained stage illumination features and stage non-illumination features are illumination features and non-illumination features respectively.
The preset number may be set according to an actual situation, for example, in a specific embodiment, the preset number is set to 4, that is, the stage illumination feature of the fourth stage obtained after the fourth stage is ended is the final illumination feature, and the stage non-illumination feature of the fourth stage is the final non-illumination feature. In other embodiments, the predetermined number may be set to other values.
In some embodiments, the third stage and the fourth stage are similar to the second stage in operation, and the parameters therein may be set to be different, for example, the number of residual modules and the number of channels of the residual modules are set to be different according to actual situations.
According to the technical solution provided by the embodiment of the present disclosure, after illumination semantic separation and semantic-graph-based illumination separation are performed on the feature image in the first stage, the illumination features and non-illumination features obtained in the first stage are used for the illumination separation of the second stage, the third stage, and so on. Multiple rounds of illumination separation are thus performed on the illumination features and non-illumination features computed in the first stage, which makes the separation effect better: as the number of stages increases, the number of separations increases, the resulting non-illumination features contain fewer and fewer illumination features, and a more accurate recognition result can be obtained during face recognition.
In some embodiments, the method is implemented by a neural network, which is obtained by training a preset neural network; in some embodiments, the preset neural network includes at least a first stage, a second stage, a third stage, and a fourth stage.
Further, as shown in fig. 5, the training process of the neural network includes steps S501 to S505:
s501, obtaining a sample image.
The sample image is used for training a preset neural network. In some embodiments, the sample image is an image containing a human face.
Further, in some embodiments, the sample image carries an illumination category label and a face category label. In a specific embodiment, the illumination category labels include: by light source, natural light on sunny days, natural light on cloudy days, white light and yellow light; by illumination angle, front light, side light and back light; the labeled illumination categories thus comprise 4 × 3 + 1 (exposure) = 13 classes. The face category label is the user ID (identification number) corresponding to the sample image.
S502, inputting the sample image into a preset neural network by taking the initial weight as a current weight parameter, and obtaining a first face recognition prediction probability and a first illumination type prediction probability of the sample image output by the preset neural network.
Wherein, the initial weight can be set according to the actual situation. In one particular embodiment, the initial weight is set to 0.5. The preset neural network performs feature extraction on the input sample image to obtain a sample feature image, and performs illumination semantic separation and illumination separation on the sample feature image at least in a first stage, a second stage, a third stage and a fourth stage to obtain a sample illumination feature and a sample non-illumination feature. And determining the type of the sample illumination characteristic, namely the illumination prediction type of the sample image, and determining the type of the sample non-illumination characteristic, namely the face prediction type of the sample image. According to the illumination type label and the illumination prediction type of the sample image, the first prediction probability of the illumination type of the sample image by the preset neural network can be obtained, and according to the face type label and the face prediction type of the sample image, the first prediction probability of the face recognition of the sample image by the preset neural network can be obtained.
The preset neural network processes the sample image and the type of the process of the trained neural network for processing the image to be processed are not described herein again.
S503, gradually reducing the current weight parameter according to a preset step length to obtain a second prediction probability of the preset neural network for face recognition of the sample image and a second prediction probability of the illumination type.
And adjusting the current weight parameters of the preset neural network, predicting the face recognition result and the illumination type recognition result of the sample image again, and correspondingly obtaining a second prediction probability of the face recognition and a second prediction probability of the illumination type.
In some embodiments, gradually reducing the current weight parameter according to the preset step size includes: starting from the fourth stage, reducing the current weight parameter of one stage each time by the preset step size, where the stage is the second stage, the third stage or the fourth stage.
The preset step length can be set according to actual conditions. In one embodiment, the preset step size is set to 0.1. In a specific embodiment, starting from the fourth stage, the current weight parameter of one stage is reduced each time, for example, in a specific embodiment, the initial weight parameters of the second stage, the third stage and the fourth stage are all set to 0.5, the first prediction probability of face recognition and the first prediction probability of illumination category are calculated, the current weight parameter of the fourth stage is reduced to 0.4, the current weight parameters of the second stage and the third stage are still 0.5, the second prediction probability of face recognition and the second prediction probability of illumination category are calculated, and so on. Further, in one embodiment, after the current weight parameter of the second stage is adjusted to 0.4 and trained to converge, the current weight parameter of the fourth stage is returned to be adjusted down again.
S504, on the premise that the second prediction probability of the face recognition is the same as the first prediction probability of the face recognition and the second prediction probability of the illumination type is the same as the first prediction probability of the illumination type, training the preset neural network to be convergent, and returning to the step of gradually reducing the current weight parameter according to the preset step length.
In some embodiments, a loss function is set to supervise that the second prediction probability of face recognition is the same as the first prediction probability of face recognition and that the second prediction probability of the illumination category is the same as the first prediction probability of the illumination category:
L = max(p1 - p3, 0) + max(p2 - p4, 0)
where p1 denotes the first prediction probability of face recognition, p3 denotes the second prediction probability of face recognition, p2 denotes the first prediction probability of the illumination category, and p4 denotes the second prediction probability of the illumination category. The value of L is the sum of the larger of p1 - p3 and 0 and the larger of p2 - p4 and 0; this is intended to ensure that p1 = p3 and p2 = p4, i.e., p1 - p3 = 0 and p2 - p4 = 0, so that the second prediction probability of face recognition equals the first prediction probability of face recognition and the second prediction probability of the illumination category equals the first prediction probability of the illumination category.
When L satisfies the corresponding condition, the neural network is trained to convergence, and the process returns to adjust the current weight parameter of the next stage.
Convergence of the neural network can be supervised as follows: the illumination feature extraction branch is supervised by a loss between the predicted illumination feature category and the true illumination category, and the non-illumination feature extraction branch is supervised by a loss between the predicted non-illumination feature category and the true face ID category; when the loss function values satisfy the set conditions, the neural network is judged to have converged. The loss function may be any loss function such as cross-entropy loss.
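As an illustrative aid, a minimal PyTorch sketch of this supervision follows; the way the association loss is combined with the cross-entropy terms (and the assoc_weight factor) is an assumption, since the disclosure only specifies the form of L:

import torch
import torch.nn.functional as F

def association_loss(p1, p2, p3, p4):
    # L = max(p1 - p3, 0) + max(p2 - p4, 0)
    # p1, p2: first prediction probabilities (face recognition / illumination category),
    #         obtained with the previous weight parameters.
    # p3, p4: second prediction probabilities, obtained after reducing a stage weight.
    zero = torch.zeros_like(p1)
    return torch.maximum(p1 - p3, zero) + torch.maximum(p2 - p4, zero)

def total_loss(face_logits, face_labels, illum_logits, illum_labels,
               p1, p2, p3, p4, assoc_weight=1.0):
    # Cross-entropy supervision of the non-illumination branch (face ID) and the
    # illumination branch (illumination category), plus the association loss above.
    ce_face = F.cross_entropy(face_logits, face_labels)
    ce_illum = F.cross_entropy(illum_logits, illum_labels)
    return ce_face + ce_illum + assoc_weight * association_loss(p1, p2, p3, p4).mean()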
And S505, stopping reducing the current weight parameter until the preset neural network is not converged any more, and obtaining the neural network.
After the current weight parameter is reduced, if the neural network can be trained to be converged, the current weight parameter of the next stage can be adjusted, but if the neural network cannot be trained to be converged, the current weight parameter is not adjusted any more, and the current weight parameter is determined as the weight parameter in the neural network.
According to the technical solution provided by the embodiment of the present disclosure, a dynamic-weight association loss method is provided: during training, the current weight parameters of each stage are dynamically reduced, and the preset neural network corresponding to each weight parameter value is trained to convergence step by step. If the current weight parameters are reduced while the accuracy remains unchanged, the neural network can separate the illumination features and non-illumination features in the image better, so a model with a better feature separation effect is obtained through training, the influence of illumination features on face recognition can be further reduced, and the face recognition accuracy is improved.
In other embodiments, the weight parameters of each stage may also be set to fixed values, for example, in a specific embodiment, the weight parameters of the second stage, the third stage, and the fourth stage are respectively set to 0.3, 0.2, and 0.1, and the preset neural network corresponding to the weight parameters is trained until convergence, so as to obtain the neural network.
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
In a specific embodiment, the steps of the above face recognition method are described below as a complete example.
First, a preset neural network is trained to obtain the illumination separation neural network. The training process is as follows:
The input of the preset neural network is a face picture with a size of 112x112. The face picture carries an illumination class label. The illumination classes are divided by light source into 4 types: natural light on a sunny day, natural light on a cloudy day, white light and yellow light; and by lighting angle into 3 types: front light, side light and back light. This gives 4x3 = 12 combinations, plus an exposure class, for a total of 13 classes. The face picture training set is labeled according to these 13 illumination classes. Each face picture in the training set also carries a user ID class label, i.e., whose face it is.
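For illustration only, the 13 illumination classes described above can be enumerated as follows; the English class names (including rendering the exposure class as "overexposed") are assumptions made for this sketch, not labels from the original disclosure.

```python
# 4 light sources x 3 lighting angles = 12 combinations, plus one exposure class.
LIGHT_SOURCES = ["sunny_natural", "cloudy_natural", "white_light", "yellow_light"]
LIGHT_ANGLES = ["front_light", "side_light", "back_light"]

ILLUMINATION_CLASSES = [f"{source}_{angle}"
                        for source in LIGHT_SOURCES
                        for angle in LIGHT_ANGLES] + ["overexposed"]
assert len(ILLUMINATION_CLASSES) == 13  # matches the 13 illumination class labels
```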
The face picture is first processed by a residual module (the BottleNeck in ResNet) with a downsampling factor of 2 and 64 output channels, and then by 2 residual modules with a downsampling factor of 1 and 64 output channels, to obtain a feature image p0 of size (56, 56, 64).
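A rough PyTorch sketch of this stem is shown below. It is an assumption of this description: a basic two-convolution residual block is used in place of the ResNet BottleNeck for brevity, and it is not the original implementation.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Simplified residual block standing in for the BottleNeck described above."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )
        # 1x1 projection on the skip path when the shape changes
        self.skip = (nn.Identity() if stride == 1 and in_ch == out_ch
                     else nn.Sequential(nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                                        nn.BatchNorm2d(out_ch)))

    def forward(self, x):
        return torch.relu(self.body(x) + self.skip(x))

# One stride-2 block (112 -> 56) followed by two stride-1 blocks, 3 -> 64 channels.
stem = nn.Sequential(
    ResidualBlock(3, 64, stride=2),
    ResidualBlock(64, 64),
    ResidualBlock(64, 64),
)
p0 = stem(torch.randn(1, 3, 112, 112))
print(p0.shape)  # torch.Size([1, 64, 56, 56]), i.e. the (56, 56, 64) feature image
```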
p0 then enters the "illumination semantics separation" module M0 as input. In M0, the "spatial illumination semantics" of p0 are computed first, as follows: p0 enters three calculation modules in parallel, and the flow of each calculation module is: a convolution with kernel size k and 32 channels, batch normalization, ReLU activation, a convolution with kernel 1x1 and 1 channel, batch normalization, and softmax. The parameter k takes the values 3, 5 and 7 in the three calculation modules respectively. The three calculation modules produce three outputs, each of dimension (56, 56, 1); superposing them gives A0 of dimension (56, 56, 3). A0 is then passed through a convolution with kernel 1x1 and 1 channel followed by softmax to obtain A1 of dimension (56, 56, 1). A1 is the spatial illumination semantic map.
By computing illumination feature images over different receptive fields through parallel multi-scale convolutions and then fusing these feature images, the spatial illumination semantic map is calculated more accurately.
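The following PyTorch module sketches the "spatial illumination semantics" computation under the assumption that each softmax runs over the spatial positions of its map; it is illustrative only, not the original code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialIlluminationSemantics(nn.Module):
    def __init__(self, in_ch=64, mid_ch=32, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # Three parallel branches: conv k (32 ch) -> BN -> ReLU -> conv 1x1 (1 ch) -> BN
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, mid_ch, k, padding=k // 2, bias=False),
                nn.BatchNorm2d(mid_ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(mid_ch, 1, 1, bias=False),
                nn.BatchNorm2d(1),
            )
            for k in kernel_sizes
        ])
        self.fuse = nn.Conv2d(len(kernel_sizes), 1, 1)  # 1x1 fusion of the 3 maps

    @staticmethod
    def spatial_softmax(x):
        # Softmax over the H*W spatial positions of each single-channel map.
        b, c, h, w = x.shape
        return F.softmax(x.view(b, c, -1), dim=-1).view(b, c, h, w)

    def forward(self, p0):
        maps = [self.spatial_softmax(branch(p0)) for branch in self.branches]  # 3 x (B,1,H,W)
        a0 = torch.cat(maps, dim=1)                                            # (B,3,H,W)
        a1 = self.spatial_softmax(self.fuse(a0))                               # (B,1,H,W)
        return a1  # spatial illumination semantic map A1
```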
Next, the "channel illumination semantics" of p0 are computed in M0, as follows: p0 is passed through a convolution with kernel 1x1 and 64 channels and multiplied by A1, and the result is fed into two parallel pooling operations, one producing an output B0 of dimension (7, 7, 64) and the other an output B1 of dimension (1, 1, 64). Multiplying the convolved p0 by A1 before pooling is equivalent to using the spatial illumination semantic map as a weight for local pooling. Dimension transformation is then performed: B0 is reshaped to (49, 64) and B1 to (1, 64). B1 and B0 are matrix-multiplied, softmax is applied, and the result is matrix-multiplied with B0 to obtain B2 of dimension (1, 64), whose dimension is further changed to (1, 1, 64).
In the calculation of the channel illumination semantic map, B1 can be regarded as the global channel illumination feature and B0 as the local channel illumination features, and the subsequent matrix operations realize the querying and encoding of the local illumination features by the global illumination feature. B2 is the channel illumination semantic map.
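A sketch of the "channel illumination semantics" computation is given below; adaptive average pooling for the two parallel pooling operations and batched matrix multiplication for the query step are assumptions of this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelIlluminationSemantics(nn.Module):
    def __init__(self, in_ch=64):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, in_ch, 1, bias=False)  # 1x1 conv, 64 channels

    def forward(self, p0, a1):
        # Weight the projected features by the spatial illumination semantic map a1.
        x = self.proj(p0) * a1
        b0 = F.adaptive_avg_pool2d(x, (7, 7))        # local channel illumination features
        b1 = F.adaptive_avg_pool2d(x, (1, 1))        # global channel illumination feature
        b, c = b1.shape[:2]
        b0 = b0.flatten(2).transpose(1, 2)           # (B, 49, C)
        b1 = b1.flatten(2).transpose(1, 2)           # (B, 1, C)
        # Global feature queries the local features, then re-encodes them.
        attn = F.softmax(b1 @ b0.transpose(1, 2), dim=-1)  # (B, 1, 49)
        b2 = attn @ b0                               # (B, 1, C)
        return b2.transpose(1, 2).view(b, c, 1, 1)   # channel illumination semantic map B2
```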
Further, the spatial non-illumination semantics A⁻ and the channel non-illumination semantics B⁻ can be obtained by subtraction from 1, i.e., A⁻ = 1 - A1 and B⁻ = 1 - B2.
Illumination semantic separation is then performed on the feature image p0: the spatial illumination semantic map and the channel illumination semantic map are multiplied with p0 to obtain the illumination feature p'0; the spatial non-illumination semantic map and the channel non-illumination semantic map are multiplied with p0 to obtain the non-illumination feature p''0.
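Putting the two semantic maps together, the separation of p0 can be sketched as follows, assuming a1 is the spatial illumination semantic map of shape (B, 1, H, W) and b2 the channel illumination semantic map of shape (B, C, 1, 1); this is a minimal illustration under those shape assumptions, relying on broadcasting.

```python
def separate_illumination(p0, a1, b2):
    """Split p0 into an illumination feature and a non-illumination feature."""
    a_neg = 1.0 - a1            # spatial non-illumination semantics
    b_neg = 1.0 - b2            # channel non-illumination semantics
    p_illum = p0 * a1 * b2      # illumination feature p'0
    p_non = p0 * a_neg * b_neg  # non-illumination feature p''0
    return p_illum, p_non
```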
The whole calculation process above constitutes the first stage of the "illumination separation neural network": the input of the first stage is a picture, and the outputs are the illumination feature and the non-illumination feature corresponding to the picture.
In the second stage, a branch calculation scheme is designed. The illumination feature and the non-illumination feature obtained from the first stage are each processed by one of two branches (each branch comprising several residual modules for feature extraction and an illumination semantic separation module for feature separation). The details are as follows:
p'0 passes through a residual module with a downsampling factor of 2 and 128 output channels, and then through 3 residual modules with a downsampling factor of 1 and 128 output channels, to obtain a feature map g1 of size (28, 28, 128); g1 then enters an illumination separation module to obtain an illumination feature map g'1 and a non-illumination feature map g''1.
p''0 passes through a residual module with a downsampling factor of 2 and 128 output channels, and then through 3 residual modules with a downsampling factor of 1 and 128 output channels, to obtain a feature map h1 of size (28, 28, 128); h1 then enters an illumination separation module to obtain an illumination feature map h'1 and a non-illumination feature map h''1.
g'1 and h'1 are added to obtain the stage illumination feature map p'1 of the second stage, and g''1 and h''1 are added to obtain the stage non-illumination feature map p''1 of the second stage. When adding, a weight parameter σ is introduced, which can be expressed as:
p'1 = g'1 + σ·h'1
p''1 = σ·g''1 + h''1
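A minimal sketch of this weighted combination is given below; g_* denotes the outputs of the branch fed by the illumination feature and h_* the outputs of the branch fed by the non-illumination feature, and these argument names are assumptions made for illustration.

```python
def combine_stage(g_illum, g_non, h_illum, h_non, sigma):
    """Combine the two branch outputs of one stage with the dynamic weight sigma,
    following p'1 = g'1 + sigma*h'1 and p''1 = sigma*g''1 + h''1."""
    stage_illum = g_illum + sigma * h_illum  # stage illumination feature map
    stage_non = sigma * g_non + h_non        # stage non-illumination feature map
    return stage_illum, stage_non
```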
In this embodiment, the calculations of the third stage and the fourth stage of the illumination separation neural network are similar to those of the second stage; only some parameters differ: the numbers of residual modules are 6 and 4 respectively, and the numbers of channels of the residual modules are 256 and 512 respectively.
In the fourth stage, the outputs p'3 and p''3 of the neural network are obtained. The two feature maps are pooled separately and then each passed through a fully connected layer for classification prediction. The illumination feature map p'3, after pooling and full connection, predicts the illumination class of the face picture, and the non-illumination feature map p''3, after pooling and full connection, predicts the ID class of the face picture. The illumination feature extraction network is supervised by the cross-entropy loss between the illumination feature prediction and the true illumination class, and the non-illumination feature extraction network is supervised by the cross-entropy loss between the non-illumination feature prediction and the true face ID class.
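The two classification heads can be sketched as follows; the number of face identities and the exact head structure are assumptions of this sketch, while the 13 illumination classes follow the labeling described earlier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RecognitionHeads(nn.Module):
    def __init__(self, channels=512, num_illum_classes=13, num_identities=1000):
        super().__init__()
        self.illum_fc = nn.Linear(channels, num_illum_classes)  # illumination class head
        self.id_fc = nn.Linear(channels, num_identities)        # face ID head (size assumed)

    def forward(self, p3_illum, p3_non):
        # Global average pooling followed by a fully connected layer for each branch.
        illum_logits = self.illum_fc(F.adaptive_avg_pool2d(p3_illum, 1).flatten(1))
        id_logits = self.id_fc(F.adaptive_avg_pool2d(p3_non, 1).flatten(1))
        return illum_logits, id_logits

def training_losses(illum_logits, id_logits, illum_labels, id_labels):
    # Cross-entropy supervision for the illumination branch and the face-ID branch.
    return F.cross_entropy(illum_logits, illum_labels) + F.cross_entropy(id_logits, id_labels)
```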
In the above technical solution, the second stage, the third stage and the fourth stage of the neural network all include two branches, wherein one branch calculates the illumination characteristic diagram, and the other branch calculates the non-illumination characteristic diagram.
In this embodiment, in order to achieve the purpose of separating the illumination features and ensure the accuracy of model prediction, a scheme of dynamic weight association loss is proposed:
Firstly, the weight parameter σ of the second stage, the third stage and the fourth stage is set to 0.5, and the preset neural network is trained to convergence.
Secondly, for an input picture, the class probabilities p1 and p2 output by the two branches with σ = 0.5 are first calculated; then σ of the fourth stage is reduced to 0.4, and the class probabilities p3 and p4 output by the two branches are obtained. In this embodiment, after σ is reduced, the probability of the corresponding output class must not decrease, which indicates that the model can still separate the illumination feature and the non-illumination feature well. The requirement that the output class probabilities do not decrease after σ is reduced can be constrained by the following loss function:
L = max(p1 - p3, 0) + max(p2 - p4, 0)
If σ is reduced and the accuracy does not change, the neural network separates the features better. Further, after the model converges, σ of the third stage can be reduced to 0.4, and then σ of the second stage, and so on. If the model cannot converge to the same accuracy after σ of a certain stage is reduced, σ of that stage is not reduced further, and the final illumination separation neural network is obtained.
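Schematically, the dynamic weight schedule can be written as below; train_to_convergence and evaluate are assumed callables standing in for the training loop (supervised with the association loss) and an accuracy measurement, so this is an outline of the procedure under those assumptions rather than the original training code.

```python
def dynamic_weight_schedule(train_to_convergence, evaluate, model,
                            stages=("stage4", "stage3", "stage2"),
                            start=0.5, step=0.1, floor=0.0):
    """Start with sigma = 0.5 in stages 2-4, train to convergence, then try to lower
    sigma one stage at a time (stage 4 first); keep each reduction only if the model
    still converges to the same accuracy."""
    sigmas = {s: start for s in stages}
    train_to_convergence(model, sigmas)
    baseline = evaluate(model, sigmas)
    for stage in stages:                            # stage 4 first, then 3, then 2
        while sigmas[stage] - step >= floor:
            trial = dict(sigmas, **{stage: sigmas[stage] - step})
            train_to_convergence(model, trial)      # supervised with the association loss
            if evaluate(model, trial) >= baseline:
                sigmas = trial                      # accuracy kept: accept the smaller sigma
            else:
                break                               # this stage can no longer be reduced
    return sigmas
```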
Second, after the model training is finished, the trained neural network is used to perform illumination separation and face recognition on the image to be processed. A face picture (the image to be processed) is input into the neural network, and the corresponding illumination features and non-illumination features are obtained. The illumination type of the image to be processed can be determined from the illumination features. Moreover, because of the illumination separation, the non-illumination features avoid the interference of illumination with the other salient features of the face, so features free of illumination interference can be extracted well; face comparison or face recognition using the non-illumination features therefore yields a better face recognition result. Further, if it is determined from the illumination type that a preset illumination condition is not met, face recognition can be refused: in that case the network may be under attack, or the illumination may be too extreme for accurate recognition, and refusing recognition prevents unknown risks.
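An inference flow along these lines can be sketched as follows; the model output signature, the gallery comparison by cosine similarity, and all helper names are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

ALLOWED_ILLUMINATION = {0, 1, 2, 3}  # assumed indices of acceptable illumination classes

def recognize(model, image, gallery_features, gallery_ids, threshold=0.5):
    """Return the matched identity, or None if recognition is refused or no match."""
    illum_logits, non_illum_feature = model(image)   # assumed model outputs
    illum_type = int(illum_logits.argmax(dim=1))
    if illum_type not in ALLOWED_ILLUMINATION:
        return None                                  # refuse: extreme light or possible attack
    # Compare the non-illumination feature against the gallery by cosine similarity.
    query = F.normalize(non_illum_feature, dim=1)
    scores = query @ F.normalize(gallery_features, dim=1).T
    best = int(scores.argmax(dim=1))
    return gallery_ids[best] if float(scores[0, best]) > threshold else None
```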
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
Fig. 6 is a schematic diagram of a face recognition apparatus according to an embodiment of the present disclosure. As shown in fig. 6, the face recognition apparatus includes:
an obtaining module 601 configured to obtain an image to be processed.
The feature extraction module 602 is configured to perform feature extraction on the image to be processed, so as to obtain a feature image of the image to be processed.
The first feature transformation module 603 is configured to perform feature transformation on the feature image to obtain a spatial illumination semantic map corresponding to the image to be processed.
The second feature transformation module 604 is configured to perform feature transformation on the feature image based on the spatial illumination semantic map to obtain a channel illumination semantic map corresponding to the image to be processed.
And the illumination separation module 605 is configured to perform illumination separation on the image to be processed according to the spatial illumination semantic graph and the channel illumination semantic graph to obtain a non-illumination characteristic of the image to be processed.
And the face recognition module 606 is configured to perform face recognition based on the non-illumination features to obtain a face recognition result.
According to the technical scheme provided by the embodiment of the disclosure, a space illumination semantic graph and a channel illumination semantic graph of an image to be processed are obtained by performing feature extraction and feature transformation on the image to be processed, illumination separation is performed according to the space illumination semantic graph and the channel illumination semantic graph to obtain non-illumination features in the image to be processed, and finally face recognition is performed on the non-illumination features to obtain a face recognition result. The device performs illumination separation by calculating the space illumination semantic graph and the channel illumination semantic graph and separates non-illumination characteristics without illumination characteristics from the image to be processed for face recognition, so that interference of the illumination characteristics in the image on the face recognition can be reduced as much as possible, a better face recognition result is obtained, and the accuracy of the face recognition is improved.
In some embodiments, the illumination separation module 605 of the above apparatus is further configured to obtain an illumination characteristic of the image to be processed; in this embodiment, the apparatus further includes: an illumination type prediction module 607 configured to determine an illumination type of the image to be processed according to the illumination characteristics; when the illumination type meets the preset illumination condition, the face recognition module 606 performs face recognition based on the non-illumination characteristic to obtain a face recognition result.
In some embodiments, the first feature transformation module 603 of the apparatus comprises: a convolution submodule 608 configured to perform parallel multi-scale convolution calculation on the feature image to obtain spatial illumination feature images of different views; and the fusion sub-module 609 is configured to fuse the space illumination characteristic images to obtain a space illumination semantic graph.
In some embodiments, the second feature transformation module 604 of the apparatus comprises: a convolution sub-module 610 configured to convolve the feature images, resulting in intermediate feature images; the feature processing sub-module 611 is configured to perform parallel pooling and dimension transformation on the intermediate feature image by using the spatial illumination semantic graph as a weight to obtain a channel illumination feature and a local channel illumination feature; the matrix operation sub-module 612 is configured to perform matrix operation on the channel illumination features and the local channel illumination features to obtain a channel illumination semantic graph.
In some embodiments, the illumination separation module 605 of the above apparatus comprises: a non-illumination semantic map determining sub-module 613 configured to obtain a spatial non-illumination semantic map and a channel non-illumination semantic map according to the spatial illumination semantic map and the channel illumination semantic map, respectively; a non-illumination characteristic determination sub-module 614, configured to multiply the characteristic image with the space non-illumination semantic map and the channel non-illumination semantic map to obtain a non-illumination characteristic; and the illumination characteristic determination sub-module 615 is configured to multiply the characteristic image with the space illumination semantic graph and the channel illumination semantic graph to obtain an illumination characteristic.
In some embodiments, the non-illumination characteristic is the stage non-illumination characteristic of the first stage and the illumination characteristic is the stage illumination characteristic of the first stage. In this embodiment, the illumination separation module 605 of the apparatus is further configured to take the stage illumination characteristic and the stage non-illumination characteristic of the first stage as an initial illumination characteristic and an initial non-illumination characteristic and input them into the second stage, and further includes: an illumination separation sub-module 616 configured to perform residual error processing and illumination separation on the initial illumination features to obtain first intermediate illumination features and first intermediate non-illumination features of the second stage, and perform residual error processing and illumination separation on the initial non-illumination features to obtain second intermediate illumination features and second intermediate non-illumination features of the second stage; a stage characteristic determining sub-module 617 configured to obtain a stage illumination characteristic of the second stage according to the first intermediate illumination characteristic and the second intermediate illumination characteristic, and obtain a stage non-illumination characteristic of the second stage according to the first intermediate non-illumination characteristic and the second intermediate non-illumination characteristic; and a circulation sub-module 618 configured to enter a next stage with the stage non-illumination characteristic and the stage illumination characteristic of the second stage as an initial non-illumination characteristic and an initial illumination characteristic, and repeat the same operation as the second stage. After the preset number of stages have been executed, the obtained stage illumination characteristic and stage non-illumination characteristic are respectively the illumination characteristic and the non-illumination characteristic.
In some embodiments, the phase characteristic determination submodule of the apparatus is further configured to: weighting and summing the first intermediate illumination characteristic and the second intermediate illumination characteristic based on the weight parameters to obtain a stage illumination characteristic of a second stage; and carrying out weighted summation on the first intermediate non-illumination characteristic and the second intermediate non-illumination characteristic based on the weight parameters to obtain the stage non-illumination characteristic of the second stage.
In some embodiments, the apparatus further comprises a model training module 619, comprising:
an image acquisition sub-module configured to acquire a sample image;
the prediction probability determination submodule is configured to input the sample image into a preset neural network by taking the initial weight as a current weight parameter of each stage, and obtain a first prediction probability of face recognition and a first prediction probability of an illumination category of the sample image output by the preset neural network;
the weight adjusting submodule is configured to gradually reduce the current weight parameter according to a preset step length to obtain a second prediction probability of the preset neural network for face recognition of the sample image and a second prediction probability of the illumination type;
the training sub-module is configured to train the preset neural network to be convergent on the premise of ensuring that the second prediction probability of the face recognition is the same as the first prediction probability of the face recognition and the second prediction probability of the illumination type is the same as the first prediction probability of the illumination type, and the step of gradually reducing the current weight parameter according to the preset step length is returned; and stopping reducing the current weight parameter until the preset neural network is not converged any more, so as to obtain the neural network.
In some embodiments, the weight adjustment submodule of the apparatus is further configured to: and starting from a fourth stage, turning down the current weight parameter of one stage each time based on the preset step length, wherein the stages comprise a second stage, a third stage or a fourth stage.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure.
Fig. 7 is a schematic diagram of an electronic device 7 provided by an embodiment of the present disclosure. As shown in fig. 7, the electronic device 7 of this embodiment includes: a processor 701, a memory 702, and a computer program 703 stored in the memory 702 and executable on the processor 701. The steps in the various method embodiments described above are implemented when the computer program 703 is executed by the processor 701. Alternatively, the processor 701 implements the functions of each module/unit in each device embodiment described above when executing the computer program 703.
Illustratively, the computer program 703 may be partitioned into one or more modules/units, which are stored in the memory 702 and executed by the processor 701 to accomplish the present disclosure. One or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 703 in the electronic device 7.
The electronic device 7 may be a desktop computer, a notebook, a palm computer, a cloud server, or other electronic device. The electronic device 7 may include, but is not limited to, the processor 701 and the memory 702. Those skilled in the art will appreciate that fig. 7 is merely an example of the electronic device 7 and does not constitute a limitation of it; the electronic device may include more or fewer components than those shown, combine certain components, or have different components, e.g., it may also include input-output devices, network access devices, buses, etc.
The Processor 701 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 702 may be an internal storage unit of the electronic device 7, for example, a hard disk or memory of the electronic device 7. The memory 702 may also be an external storage device of the electronic device 7, such as a plug-in hard disk provided on the electronic device 7, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. Further, the memory 702 may also include both an internal storage unit of the electronic device 7 and an external storage device. The memory 702 is used to store computer programs and other programs and data required by the electronic device. The memory 702 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules, so as to perform all or part of the functions described above. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
In the embodiments provided in the present disclosure, it should be understood that the disclosed apparatus/electronic device and method may be implemented in other ways. For example, the above-described apparatus/electronic device embodiments are merely illustrative, and for example, a module or a unit may be divided into only one logical function, and may be implemented in other ways, and multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, the present disclosure may implement all or part of the flow of the methods in the above embodiments by instructing relevant hardware through a computer program, which may be stored in a computer readable storage medium; when the computer program is executed by a processor, the steps of the above method embodiments may be implemented. The computer program may comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content contained in the computer readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer readable media do not include electrical carrier signals or telecommunications signals.
The above examples are only intended to illustrate the technical solutions of the present disclosure, not to limit them; although the present disclosure has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present disclosure, and are intended to be included within the scope of the present disclosure.

Claims (12)

1. A face recognition method, comprising:
acquiring an image to be processed;
extracting the features of the image to be processed to obtain a feature image of the image to be processed;
performing feature transformation on the feature image to obtain a space illumination semantic graph corresponding to the image to be processed;
performing feature transformation on the feature image based on the space illumination semantic graph to obtain a channel illumination semantic graph corresponding to the image to be processed;
according to the space illumination semantic graph and the channel illumination semantic graph, carrying out illumination separation on the image to be processed to obtain non-illumination characteristics of the image to be processed;
and carrying out face recognition based on the non-illumination characteristics to obtain a face recognition result.
2. The method according to claim 1, wherein the illumination separation is performed on the image to be processed according to the space illumination semantic graph and the channel illumination semantic graph, so that the illumination characteristic of the image to be processed is obtained while the non-illumination characteristic of the image to be processed is obtained;
the method further comprises the following steps: determining the illumination type of the image to be processed according to the illumination characteristics;
and when the illumination type meets a preset illumination condition, performing face recognition based on the non-illumination characteristic to obtain a face recognition result.
3. The method according to claim 1, wherein the performing the feature transformation on the feature image to obtain the spatial illumination semantic map corresponding to the image to be processed includes:
performing parallel multi-scale convolution calculation on the characteristic image to obtain space illumination characteristic images of different views;
and fusing the space illumination characteristic images to obtain the space illumination semantic graph.
4. The method according to claim 1, wherein the performing the feature transformation on the feature image based on the spatial illumination semantic map to obtain a channel illumination semantic map corresponding to the image to be processed comprises:
performing convolution on the characteristic image to obtain an intermediate characteristic image;
performing parallel pooling and dimension transformation on the intermediate characteristic image by using the space illumination semantic graph as weight to obtain a channel illumination characteristic and a local channel illumination characteristic;
and performing matrix operation on the channel illumination characteristics and the local channel illumination characteristics to obtain the channel illumination semantic graph.
5. The method according to claim 2, wherein the performing illumination separation on the image to be processed according to the spatial illumination semantic map and the channel illumination semantic map comprises:
respectively obtaining a space non-illumination semantic graph and a channel non-illumination semantic graph according to the space illumination semantic graph and the channel illumination semantic graph;
multiplying the characteristic image by the space non-illumination semantic graph and the channel non-illumination semantic graph to obtain the non-illumination characteristic;
and multiplying the characteristic image by the space illumination semantic graph and the channel illumination semantic graph to obtain the illumination characteristic.
6. The method of claim 4, wherein the non-illumination characteristic is a stage non-illumination characteristic of a first stage, and the illumination characteristic is a stage illumination characteristic of the first stage; the illumination separation is carried out on the image to be processed according to the space illumination semantic graph and the channel illumination semantic graph, and the method further comprises the following steps:
taking the stage illumination characteristic and the stage non-illumination characteristic of the first stage as an initial illumination characteristic and an initial non-illumination characteristic, and inputting the initial illumination characteristic and the initial non-illumination characteristic into a second stage;
performing residual error processing and illumination separation on the initial illumination characteristics to obtain first intermediate illumination characteristics and first intermediate non-illumination characteristics of a second stage, and performing residual error processing and illumination separation on the initial non-illumination characteristics to obtain second intermediate illumination characteristics and second intermediate non-illumination characteristics of the second stage;
obtaining stage illumination characteristics of a second stage according to the first intermediate illumination characteristics and the second intermediate illumination characteristics, and obtaining stage non-illumination characteristics of the second stage according to the first intermediate non-illumination characteristics and the second intermediate non-illumination characteristics;
taking the stage non-illumination characteristic and the stage illumination characteristic of the second stage as an initial non-illumination characteristic and an initial illumination characteristic, entering the next stage, and repeating the same operation as the second stage;
and after the stages corresponding to the preset number are executed, the stage illumination characteristics and the stage non-illumination characteristics are respectively the illumination characteristics and the non-illumination characteristics.
7. The method of claim 6, wherein:
the obtaining of the stage illumination characteristic of the second stage according to the first intermediate illumination characteristic and the second intermediate illumination characteristic includes: carrying out weighted summation on the first intermediate illumination characteristic and the second intermediate illumination characteristic based on a weight parameter to obtain a stage illumination characteristic of the second stage;
the obtaining of the stage non-illumination feature of the second stage according to the first intermediate non-illumination feature and the second intermediate non-illumination feature comprises: and carrying out weighted summation on the first intermediate non-illumination characteristic and the second intermediate non-illumination characteristic based on the weight parameters to obtain the stage non-illumination characteristic of the second stage.
8. The method of claim 7, wherein the method is implemented by a neural network; the neural network is obtained by training a preset neural network; the training process of the neural network comprises the following steps:
acquiring a sample image;
inputting the sample image into the preset neural network by taking the initial weight as a current weight parameter of each stage to obtain a first prediction probability of face recognition and a first prediction probability of illumination type of the sample image output by the preset neural network;
gradually reducing the current weight parameter according to a preset step length to obtain a second prediction probability of the preset neural network for face recognition of the sample image and a second prediction probability of the illumination type;
on the premise of ensuring that the second prediction probability of the face recognition is the same as the first prediction probability of the face recognition and the second prediction probability of the illumination type is the same as the first prediction probability of the illumination type, training the preset neural network to be convergent, and returning to the step of gradually reducing the current weight parameter according to a preset step length;
and stopping reducing the current weight parameter until the preset neural network is not converged any more, so as to obtain the neural network.
9. The method of claim 8, wherein the step-down of the current weight parameter by a preset step size comprises:
starting from a fourth stage, turning down the current weight parameter of one stage each time based on the preset step length, wherein the stage comprises the second stage, the third stage or the fourth stage.
10. A face recognition apparatus, comprising:
an acquisition module configured to acquire an image to be processed;
the characteristic extraction module is configured to extract the characteristics of the image to be processed to obtain a characteristic image of the image to be processed;
the first feature transformation module is configured to perform feature transformation on the feature image to obtain a spatial illumination semantic graph corresponding to the image to be processed;
the second feature transformation module is configured to perform feature transformation on the feature image based on the spatial illumination semantic graph to obtain a channel illumination semantic graph corresponding to the image to be processed;
the illumination separation module is configured to perform illumination separation on the image to be processed according to the space illumination semantic graph and the channel illumination semantic graph to obtain a non-illumination characteristic of the image to be processed;
and the face recognition module is configured to perform face recognition based on the non-illumination characteristics to obtain a face recognition result.
11. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 9 when executing the computer program.
12. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 9.
CN202111552557.2A 2021-12-17 2021-12-17 Face recognition method and device, electronic equipment and computer readable storage medium Pending CN114332993A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111552557.2A CN114332993A (en) 2021-12-17 2021-12-17 Face recognition method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111552557.2A CN114332993A (en) 2021-12-17 2021-12-17 Face recognition method and device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114332993A true CN114332993A (en) 2022-04-12

Family

ID=81052523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111552557.2A Pending CN114332993A (en) 2021-12-17 2021-12-17 Face recognition method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114332993A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024044915A1 (en) * 2022-08-29 2024-03-07 西门子股份公司 Image comparison method and apparatus for error detection, and computer device


Similar Documents

Publication Publication Date Title
US11436739B2 (en) Method, apparatus, and storage medium for processing video image
EP3637317A1 (en) Method and apparatus for generating vehicle damage information
CN110866471A (en) Face image quality evaluation method and device, computer readable medium and communication terminal
CN111860573A (en) Model training method, image class detection method and device and electronic equipment
EP4137991A1 (en) Pedestrian re-identification method and device
CN111444744A (en) Living body detection method, living body detection device, and storage medium
CN109977832B (en) Image processing method, device and storage medium
CN110941978B (en) Face clustering method and device for unidentified personnel and storage medium
CN112766284B (en) Image recognition method and device, storage medium and electronic equipment
WO2023173646A1 (en) Expression recognition method and apparatus
CN112330624A (en) Medical image processing method and device
CN114330565A (en) Face recognition method and device
CN114332993A (en) Face recognition method and device, electronic equipment and computer readable storage medium
CN112215831B (en) Method and system for evaluating quality of face image
WO2023029559A1 (en) Data processing method and apparatus
CN112287945A (en) Screen fragmentation determination method and device, computer equipment and computer readable storage medium
CN115577768A (en) Semi-supervised model training method and device
WO2021189321A1 (en) Image processing method and device
CN117036658A (en) Image processing method and related equipment
CN113971830A (en) Face recognition method and device, storage medium and electronic equipment
CN114140427A (en) Object detection method and device
CN113610856A (en) Method and device for training image segmentation model and image segmentation
CN112070022A (en) Face image recognition method and device, electronic equipment and computer readable medium
CN113869253A (en) Living body detection method, living body training device, electronic apparatus, and medium
CN113762037A (en) Image recognition method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20221227

Address after: 518054 cable information transmission building 25f2504, no.3369 Binhai Avenue, Haizhu community, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province

Applicant after: Shenzhen Xumi yuntu Space Technology Co.,Ltd.

Address before: No.103, no.1003, Nanxin Road, Nanshan community, Nanshan street, Nanshan District, Shenzhen City, Guangdong Province

Applicant before: Shenzhen Jizhi Digital Technology Co.,Ltd.