CN112348783B - Image-based person identification method and device and computer-readable storage medium - Google Patents

Image-based person identification method and device and computer-readable storage medium

Info

Publication number
CN112348783B
Authority
CN
China
Prior art keywords
image
face feature
face
resolution
contrast
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011167731.7A
Other languages
Chinese (zh)
Other versions
CN112348783A (en)
Inventor
张森
黄学涛
黄思源
吴宏扬
许云侠
任世祥
李小雨
邓易
刘海军
游可欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jianjian Tong Sanya International Technology Co ltd
Original Assignee
Jianjian Tong Sanya International Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jianjian Tong Sanya International Technology Co ltd filed Critical Jianjian Tong Sanya International Technology Co ltd
Priority to CN202011167731.7A priority Critical patent/CN112348783B/en
Publication of CN112348783A publication Critical patent/CN112348783A/en
Application granted granted Critical
Publication of CN112348783B publication Critical patent/CN112348783B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image-based person identification method and device and a computer-readable storage medium, wherein the method comprises the following steps: acquiring an image to be identified; performing image cutting on the image to be identified to obtain a plurality of sub-images; extracting the face features of each sub-image in the plurality of sub-images to obtain at least one face feature image; performing resolution enhancement processing on each face feature image in the at least one face feature image to obtain a high-resolution face feature image corresponding to each face feature image; and carrying out face recognition on each high-resolution face feature image in the high-resolution face feature images to obtain a face recognition result. The invention can effectively handle person recognition in low-resolution images whose picture information has been lost through encoding and compression, and can therefore fully satisfy person recognition for pictures crawled from web pages.

Description

Image-based person identification method and device and computer-readable storage medium
Technical Field
The invention relates to the technical field of image recognition, in particular to a person recognition method and device based on an image and a computer readable storage medium.
Background
With the development of internet technology, identifying people in news pictures by face recognition has become a mature technology. At present, face recognition solutions are mainly aimed at existing open-source image data sets; most open-source face data have high resolution, and the context of the images is easy to extract, so existing pre-trained models for face detection and recognition can effectively handle high-resolution images that have not been transmitted through web pages.
However, most news pictures are currently obtained by crawling web pages, and transmission compression causes loss of image edge information and of the semantic expression of pixel points in local face regions, so that target information about the faces in the news pictures is lost.
Disclosure of Invention
In order to solve the problem that existing picture-based person recognition methods can only handle high-resolution images that have not been transmitted through web pages and cannot recognize the persons in pictures crawled from web pages, the invention aims to provide a recognition method, a recognition device and a computer-readable storage medium that can effectively recognize the persons in pictures crawled from web pages.
In a first aspect, the present invention provides a person identification method based on an image, including:
acquiring an image to be identified;
performing image cutting on the image to be identified to obtain a plurality of sub-images;
extracting the face features of each sub-image in the plurality of sub-images to obtain at least one face feature image;
performing resolution enhancement processing on each face feature image in the at least one face feature image to obtain a high-resolution face feature image corresponding to each face feature image;
and carrying out face recognition on each high-resolution face feature image in the high-resolution face feature images to obtain a face recognition result.
Based on the above disclosure, in the invention, by cutting the image to be recognized and extracting the face features of each sub-image obtained by cutting, the edge information of the image can be extracted as much as possible, loss and deletion of face image information are prevented, the face prior knowledge and semantic information in the image to be recognized are fully utilized, and the information representation of the image data is enhanced; meanwhile, resolution enhancement processing is performed on the face feature image after face feature extraction, which increases the resolution of the image, provides excellent image preprocessing data for the subsequent face recognition, and improves the recognition accuracy.
Through the above design, the image to be recognized is segmented and the face features of each sub-image obtained by segmentation are extracted, so that the image edge information can be fully extracted and loss of face information avoided; meanwhile, resolution enhancement processing is performed on the extracted face feature image, which increases the resolution of the image and with it the recognition accuracy. Therefore, the invention can effectively handle person recognition in low-resolution images whose picture information has been lost after encoding and compression, and can fully satisfy person recognition for pictures crawled from web pages.
In one possible design, performing face feature extraction on each of the plurality of sub-images to obtain at least one face feature map, including:
screening each sub-image and removing sub-images that do not contain a human face to obtain at least one screened sub-image;
performing first convolution processing on each screened sub-image in the at least one screened sub-image to obtain a face position feature map corresponding to each screened sub-image;
and extracting a face image area in the face position feature map to obtain the at least one face feature map.
Based on the disclosure, the invention discloses specific steps of face feature extraction, namely, image screening is firstly carried out, sub-images without faces are removed, then the screened sub-images are convoluted, the position of the face in each screened sub-image is identified, finally, a face image area in each screened sub-image is extracted according to the face position, and further a face feature image is obtained.
In one possible design, performing resolution enhancement processing on each of the at least one face feature map to obtain a high-resolution face feature map corresponding to each face feature map, includes:
carrying out enhancement processing on the image contrast frequency domain range of each face feature image to obtain a contrast enhanced face feature image corresponding to each face feature image;
and performing super-resolution reconstruction processing on each contrast enhanced face feature image in the contrast enhanced face feature images to obtain a high-resolution face feature image corresponding to each face feature image.
Based on the disclosure, the invention discloses specific steps of enhancing the resolution of a face feature map, namely, two-step enhancement is carried out, wherein the enhancement processing of the image contrast frequency domain range is the increase of the local resolution of an image, and the super-resolution reconstruction processing is the enhancement of the spatial information of image area pixels, so that the pixel area of the face five-sense organ information is clearer. Through the design, the resolution of the image can be greatly improved, excellent image preprocessing data are provided for subsequent face recognition, and the recognition precision is improved.
In one possible design, performing enhancement processing on an image contrast frequency domain range on each face feature image to obtain a contrast-enhanced face feature image corresponding to each face feature image, including:
selecting any pixel point from each face feature map, and obtaining a local area which takes the pixel point as the center and has the size of (2n +1) × (2n +1), wherein n is an integer;
carrying out image low-frequency processing on the local area to obtain a low-frequency local area;
locally correcting the low-frequency local area to obtain a low-frequency local area with enhanced contrast;
and selecting the next pixel point, and repeating the steps until each pixel point in each face feature image is selected completely, so as to obtain a contrast enhanced face feature image corresponding to each face feature image.
Based on the disclosure, the invention discloses a specific step of performing image contrast frequency domain range enhancement processing on each face feature map, namely, selecting any pixel point in each face feature map, taking the pixel point as a center, selecting a region with the size of (2n +1) × (2n +1) as a local region, representing the contrast of the local region by performing low-frequency processing on the image of the local region, then enhancing the contrast of the region by correcting the local region, and finally completing the contrast enhancement on each face feature map after traversing each pixel point in each face feature map to obtain a contrast enhanced face feature map.
In one possible design, performing super-resolution reconstruction processing on each of the contrast-enhanced face feature maps to obtain a high-resolution face feature map corresponding to each of the face feature maps, includes:
performing channel number expansion on each contrast enhanced face feature image in the contrast enhanced face feature images to obtain a multi-channel face feature image corresponding to each contrast enhanced face feature image;
performing second convolution processing on each multi-channel face feature image in the multi-channel face feature images to obtain a pixel face feature image of each multi-channel face feature image;
and rearranging the pixels of each pixel face feature image in the pixel face feature image according to the spatial distribution format of the pixels in the corresponding sub-image to obtain the high-resolution face feature image.
Based on the above disclosure, the present invention discloses a specific step of performing super-resolution reconstruction processing on each contrast-enhanced face feature map, namely, firstly performing channel number expansion (equivalent to image amplification), secondly performing second convolution processing to realize recombination of pixel points in the amplified image, and finally rearranging the pixel points (namely, the position relationship of the person is unchanged) according to the spatial distribution format of the pixels in the original sub-image, so as to obtain an image with enhanced pixels (equivalent to performing pixel enhancement on a face region in the face feature map to make the face region more clear after amplification). Through the design, the facial features can be clearer, and the recognition precision is improved.
In one possible design, performing face recognition on each of the high-resolution face feature maps to obtain a face recognition result, including:
performing side face correction on each high-resolution face feature image to obtain a face correction feature image corresponding to each high-resolution face feature image;
and carrying out face recognition on each face correction feature image in the face correction feature images to obtain a face recognition result.
Based on the above disclosure, the invention discloses the specific steps of face recognition, namely that face correction is performed first: the faces in many images are often side faces, and if the side faces are not corrected, the recognition error is larger and the recognition accuracy is reduced. Therefore, by first correcting the side face and then recognizing the face, the invention can further improve the recognition accuracy.
In one possible design, performing face recognition on each of the face correction feature maps to obtain the face recognition result, including:
carrying out image comparison on the face in each face correction feature image and the labeled face in the face database to obtain an image comparison result;
judging whether the image comparison result meets a preset threshold value or not;
and if so, taking the labeled human face corresponding to the image comparison result which accords with a preset threshold value as the human face recognition result.
Based on the disclosure, the invention discloses a specific step of carrying out face recognition on a face correction characteristic image, namely carrying out image comparison with a label face stored in a face database to obtain an image comparison result, and then taking a face label corresponding to the image comparison result which meets a preset threshold value as a recognition result.
In a second aspect, the present invention provides an image-based person recognition apparatus, comprising: the system comprises an acquisition unit, an image cutting unit, a human face feature extraction unit, a resolution enhancement unit and a human face recognition unit;
the acquisition unit is used for acquiring an image to be identified;
the image cutting unit is used for carrying out image cutting on the image to be identified to obtain a plurality of sub-images;
the face feature extraction unit is used for extracting the face feature of each sub-image in the plurality of sub-images to obtain at least one face feature image;
the resolution enhancement unit is used for performing resolution enhancement processing on each face feature image in the at least one face feature image to obtain a high-resolution face feature image corresponding to each face feature image;
and the face recognition unit is used for carrying out face recognition on each high-resolution face feature image in the high-resolution face feature images to obtain a face recognition result.
In one possible design, the face feature extraction unit includes: the system comprises a screening subunit, a first convolution processing subunit and a face feature image extracting subunit;
the screening subunit is used for screening each sub-image, removing the sub-images without the human face and obtaining at least one screened sub-image;
the first convolution processing subunit is configured to perform first convolution processing on each filtered sub-image of the at least one filtered sub-image to obtain a face position feature map corresponding to each filtered sub-image;
the face feature map extraction subunit is configured to extract a face image region in the face position feature map to obtain the at least one face feature map.
In one possible design, the resolution enhancement unit includes: a contrast frequency domain range enhancer unit and a super-resolution reconstruction subunit;
the contrast frequency domain range increasing subunit is used for performing enhancement processing on the image contrast frequency domain range on each human face feature image to obtain a contrast enhanced human face feature image corresponding to each human face feature image;
and the super-resolution reconstruction subunit is used for performing super-resolution reconstruction processing on each contrast enhanced face feature image in the contrast enhanced face feature images to obtain a high-resolution face feature image corresponding to each face feature image.
In one possible design;
the contrast frequency domain range increasing subunit is specifically configured to select any pixel point in each face feature map, and obtain a local area which is centered on the pixel point and has a size of (2n +1) × (2n +1), where n is an integer;
the contrast frequency domain range increasing subunit is specifically configured to perform image low-frequency processing on the local region to obtain a low-frequency local region;
the contrast frequency domain range increasing subunit is specifically configured to perform local correction on the low-frequency local region to obtain a low-frequency local region with enhanced contrast;
and the contrast frequency domain range increasing subunit is further specifically used for selecting a next pixel point, and repeating the steps until each pixel point in each face feature image is selected completely, so as to obtain a contrast enhanced face feature image corresponding to each face feature image.
In one possible design;
the super-resolution reconstruction subunit is specifically configured to perform channel number expansion on each contrast-enhanced face feature map in the contrast-enhanced face feature maps to obtain a multi-channel face feature map corresponding to each contrast-enhanced face feature map;
the super-resolution reconstruction subunit is specifically configured to perform second convolution processing on each multi-channel face feature map in the multi-channel face feature maps to obtain a pixel face feature map of each multi-channel face feature map;
the super-resolution reconstruction subunit is further specifically configured to rearrange the pixels of each of the pixel face feature maps in the pixel face feature map according to a spatial distribution format of the pixels in the corresponding sub-image, so as to obtain the high-resolution face feature map.
In one possible design, the face recognition unit includes: a side face rectification subunit and an image identification subunit;
the side face correction subunit is configured to perform side face correction on the face image of each high-resolution face feature map to obtain a face correction feature map corresponding to each high-resolution face feature map;
and the image identification subunit is used for carrying out face identification on each face correction characteristic image in the face correction characteristic images to obtain the face identification result.
In one possible design;
the image identification subunit is specifically configured to perform image comparison on the face in each face correction feature image and the labeled face in the face database to obtain an image comparison result;
the image identification subunit is further specifically configured to determine whether the image comparison result meets a preset threshold, and if so, take a labeled face corresponding to the image comparison result meeting the preset threshold as the face identification result.
In a third aspect, the present invention provides a second image-based person identification apparatus, comprising a memory, a processor and a transceiver, which are sequentially connected in communication, wherein the memory is used for storing a computer program, the transceiver is used for transmitting and receiving messages, and the processor is used for reading the computer program and executing the image-based person identification method as described in the first aspect or any one of the possible designs of the first aspect.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon instructions which, when run on a computer, perform the image-based person recognition method as described in the first aspect or any one of the possible designs of the first aspect.
In a fifth aspect, the present invention provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the image-based person recognition method as described in the first aspect or any one of the possible designs of the first aspect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a person identification method based on images according to the present invention.
Fig. 2 is a schematic structural diagram of a first image-based person recognition apparatus according to the present invention.
Fig. 3 is a schematic structural diagram of a second image-based person recognition apparatus according to the present invention.
Fig. 4 is a diagram of the cutting effect of the image to be recognized and the sub-image provided by the present invention.
Fig. 5 is a schematic diagram of a face image area provided by the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments. It should be noted that the following examples are provided to aid understanding of the present invention, but are not intended to limit the present invention. Specific structural and functional details disclosed herein are merely illustrative of example embodiments of the invention. This invention may, however, be embodied in many alternate forms and should not be construed as limited to the embodiments set forth herein.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments of the present invention.
It should be understood that, for the term "and/or" as may appear herein, it is merely an associative relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, B exists alone, and A and B exist at the same time; for the term "/and" as may appear herein, which describes another associative object relationship, it means that two relationships may exist, e.g., a/and B, may mean: a exists independently, and A and B exist independently; in addition, for the character "/" that may appear herein, it generally means that the former and latter associated objects are in an "or" relationship.
It will be understood that when an element is referred to herein as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Conversely, if a unit is referred to herein as being "directly connected" or "directly coupled" to another unit, it is intended that no intervening units are present. In addition, other words used to describe the relationship between elements should be interpreted in a similar manner (e.g., "between" versus "directly between", "adjacent" versus "directly adjacent", etc.).
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes" and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, and/or groups thereof.
It should also be noted that, in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed substantially concurrently, or the figures may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
It should be understood that specific details are provided in the following description to facilitate a thorough understanding of example embodiments. However, it will be understood by those of ordinary skill in the art that the example embodiments may be practiced without these specific details. For example, systems may be shown in block diagrams in order not to obscure the examples in unnecessary detail. In other instances, well-known processes, structures and techniques may be shown without unnecessary detail in order to avoid obscuring the example embodiments.
Examples
As shown in fig. 1, the image-based person identification method provided in the first aspect of this embodiment is suitable for person identification in an image after encoding and compression, and can effectively deal with person identification in a picture in which human face target information is lost due to transmission and compression, that is, can satisfy person identification of an image crawled through a web page in real life.
The image-based person identification method provided in the present embodiment may include, but is not limited to, the following steps S101 to S105.
S101, obtaining an image to be identified.
Step S101 is the process of acquiring the image to be recognized; in this embodiment, the image to be recognized may be obtained, for example but not limited to, by: crawling directly from a web page, retrieving directly from an image database, and/or direct upload by a worker.
And S102, carrying out image cutting on the image to be identified to obtain a plurality of sub-images.
Step S102 is a process of image segmentation for the image to be recognized, so as to provide an image basis for extracting edge information of a subsequent image.
In this embodiment, for example, the sizes of the sub-images are the same, and the number may be, but is not limited to: 9 or 16.
In this embodiment, for example, the image cutting may adopt, but is not limited to: a python opencv script; that is, a python opencv script is used to successively segment small regions of the image to be identified and save each segmented region as a new small picture, and each saved small picture is a sub-image.
For example, assuming that the width and height of each sub-image to be cut are w and h respectively, the sub-image in column col and row row (both counted from 0) has top-left corner coordinate (col × w, row × h) and bottom-right corner coordinate (col × w + w, row × h + h); the whole sub-image can therefore be represented by its top-left and bottom-right corner coordinates, and by cutting according to these coordinates the image to be recognized can finally be cut into a plurality of identical sub-images.
As shown in fig. 4, (a) in fig. 4 is the image to be recognized and (b) is the result after image cutting; as can be seen from (b) in fig. 4, the image to be recognized can be cut into 9 sub-images of the same size.
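Purely as an illustration, the grid cutting described above can be sketched in a few lines of Python with opencv; the function name cut_image, the default 3 × 3 grid and the output file names are assumptions of this example, while the corner-coordinate scheme follows the embodiment.

```python
# A minimal sketch of the grid cutting described above; cut_image and
# the 3x3 default are hypothetical, only the corner-coordinate scheme
# (col*w, row*h) to (col*w + w, row*h + h) follows the embodiment.
import cv2

def cut_image(path, rows=3, cols=3):
    img = cv2.imread(path)
    total_h, total_w = img.shape[:2]
    w, h = total_w // cols, total_h // rows   # width and height of each sub-image
    sub_images = []
    for row in range(rows):
        for col in range(cols):
            # top-left (col*w, row*h), bottom-right (col*w + w, row*h + h)
            sub = img[row * h : row * h + h, col * w : col * w + col * 0 + w] if False else img[row * h : row * h + h, col * w : col * w + w]
            cv2.imwrite(f"sub_{row}_{col}.png", sub)  # save as a new small picture
            sub_images.append(sub)
    return sub_images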
And S103, extracting the face feature of each sub-image in the plurality of sub-images to obtain at least one face feature image.
Step S103 is a process of extracting the face feature of each sub-image, and since the image to be recognized is cut into a plurality of sub-images in step S102, the edge information in the image to be recognized can be extracted as much as possible by extracting the face features of the plurality of sub-images, thereby preventing loss of the face image information and enhancing the information expression of the image data.
In this embodiment, the example of extracting the face feature of each of the multiple sub-images to obtain at least one face feature map may include, but is not limited to, the following steps S103a to S103c.
S103a, screening each sub-image, removing the sub-images without the human face, and obtaining at least one screened sub-image.
Step S103a is the sub-image screening process. Since the image to be recognized is cut into a plurality of sub-images in step S102, and only the sub-images containing a face are needed for the subsequent face feature extraction, the sub-images not containing a face need to be screened out; this avoids affecting the subsequent face feature extraction, and at the same time prevents sub-images without faces from occupying resources and reducing the extraction speed.
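The embodiment does not name the detector used for screening; as a stand-in for illustration only, the sketch below uses OpenCV's bundled Haar cascade to decide whether a sub-image contains a face.

```python
# An illustrative screening sketch: OpenCV's stock Haar cascade stands
# in for whatever face detector performs the screening in practice.
import cv2

_face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def screen(sub_images):
    kept = []
    for sub in sub_images:
        gray = cv2.cvtColor(sub, cv2.COLOR_BGR2GRAY)
        if len(_face_detector.detectMultiScale(gray)) > 0:
            kept.append(sub)      # keep only sub-images that contain a face
    return kept
```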
S103b, performing first convolution processing on each screened sub-image in the at least one screened sub-image to obtain a face position feature map corresponding to each screened sub-image.
And S103c, extracting a face image area in the face position feature map to obtain the at least one face feature map.
Step S103b and step S103c are the face feature extraction processes; in this embodiment, for example, the face feature extraction may be performed by, but is not limited to: a yolov3 prediction model.
The yolov3 prediction model contains 106 convolutional layers, performs feature extraction using multi-scale feature maps, and can extract features of finer granularity; it operates at three scales, namely 1/32, 1/16 and 1/8.
In this embodiment, feature extraction is performed at the 1/8 scale, and the steps are as follows: each screened sub-image is input into the yolov3 prediction model, and convolution up to the 79th convolutional layer yields a feature map at 1/32 scale; this 1/32-scale feature map is channel-fused with the feature map produced by the 61st convolutional layer to obtain a first feature map, and a convolution operation on the first feature map (for example, with a kernel size of 3, i.e. a 3 × 3 convolution scale, stride 1 and convolutional-layer depth 32) yields a feature map at 1/16 scale; the 1/16-scale feature map is then channel-fused with the feature map of the 36th layer to obtain a second feature map, and a further convolution (for example, likewise with kernel size 3, i.e. a 3 × 3 convolution scale, stride 1 and convolutional depth 16) yields the feature map at 1/8 scale (namely the face feature map).
In this embodiment, step S103b and step S103c are both implemented by a yolov3 prediction model, that is, in the convolution process described above, a face position feature map can be obtained, and then a face image region is extracted based on the face position feature map to obtain a face feature map.
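The channel fusion just described can be pictured schematically as follows; this is a sketch under the assumption of PyTorch with illustrative shapes and channel counts, not the actual yolov3 layer definitions.

```python
# Schematic sketch (assumed PyTorch) of the channel fusion described
# above: a deeper, lower-resolution feature map is upsampled,
# concatenated with a shallower one along the channel dimension, and
# passed through a 3x3, stride-1 convolution to produce the next scale.
import torch
import torch.nn.functional as F

def fuse_scales(deep, shallow, conv):
    # deep: (N, C1, H/2, W/2), shallow: (N, C2, H, W), conv: a 3x3 Conv2d
    up = F.interpolate(deep, scale_factor=2, mode="nearest")  # match sizes
    fused = torch.cat([up, shallow], dim=1)                   # channel fusion
    return conv(fused)                                        # e.g. 1/32 -> 1/16

# usage with illustrative shapes and channel counts
conv = torch.nn.Conv2d(96, 32, kernel_size=3, stride=1, padding=1)
out = fuse_scales(torch.randn(1, 64, 13, 13), torch.randn(1, 32, 26, 26), conv)
```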
In this embodiment, the extraction of the face image region may be performed by, but is not limited to: selecting 4 coordinate points from the pixel coordinate points in the face position feature map, the region enclosed by these 4 coordinate points being the face image region; as shown in fig. 5, the face image region is the region outlined by the rectangular frame.
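A trivial sketch of the region extraction follows; crop_face_region and the box layout are hypothetical, the embodiment only specifies that 4 coordinate points enclose the face image region.

```python
# Hypothetical helper: extracting the face image region enclosed by the
# predicted corner coordinates (the rectangular frame of fig. 5).
def crop_face_region(sub_image, box):
    x1, y1, x2, y2 = box              # top-left and bottom-right corners
    return sub_image[y1:y2, x1:x2]    # the cropped face feature map region
```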
Through the steps, the face features of the screened sub-images can be extracted to obtain a face feature image, and image data is provided for the subsequent identification step.
And S104, performing resolution enhancement processing on each face feature image in the at least one face feature image to obtain a high-resolution face feature image corresponding to each face feature image.
Step S104 performs resolution enhancement processing on each face feature map, that is, it increases the image definition, so as to provide excellent image preprocessing data for the face recognition in the subsequent steps and improve the recognition accuracy.
In this embodiment, for example, resolution enhancement processing is performed on each face feature map to obtain a high-resolution face feature map corresponding to each face feature map, which may include, but is not limited to, the following steps S104a to S104b.
S104a, enhancing the image contrast frequency domain range of each face feature image to obtain a contrast enhanced face feature image corresponding to each face feature image.
S104b, performing super-resolution reconstruction processing on each contrast enhancement face feature image in the contrast enhancement face feature images to obtain a high-resolution face feature image corresponding to each face feature image.
In this embodiment, resolution enhancement processing is performed on the face feature map, which mainly includes two-aspect enhancement, where the enhancement processing of the image contrast frequency domain range is to perform enhancement of the local resolution of the image, so as to ensure that the whole face feature map has higher resolution; the super-resolution reconstruction processing aims at enhancing the spatial information of the image area pixels in the face feature map, and can enable the pixel area of the face five sense organs information in the face feature map to be clearer. Through the design, the resolution of the image can be greatly improved, excellent image preprocessing data are provided for subsequent face recognition, and the recognition precision is further improved.
In this embodiment, the enhancement processing of the image contrast frequency domain range on each face feature map may be performed by, but is not limited to, the following steps S104a1 to S104a4.
The principle of steps S104a1 to S104a4 is described below. In this embodiment, steps S104a1 to S104a4 implement an image adaptive contrast enhancement algorithm, namely the ACE (Adaptive Contrast Enhancement) algorithm. Its principle is to divide a picture into two parts: the first is the low-frequency part, which can be obtained by smooth blurring (i.e. low-pass filtering) of the picture; the second is the high-frequency part, which can be obtained by subtracting the low-frequency part from the original image. The ACE algorithm then adds the high-frequency part, which represents the details of the image, back onto the low-frequency part: the high-frequency part is multiplied by a certain gain value and added to the low-frequency part to obtain a detail-enhanced image, and the gain value can be expressed as a quantity related to the variance of the image. Steps S104a1 to S104a4 therefore obtain the high-frequency part of the image, multiply it by the gain value and add it to the low-frequency part to obtain the detail-enhanced image.
S104a1, selecting any pixel point in each face feature map, and obtaining a local area which takes the pixel point as a center and has the size of (2n +1) × (2n +1), wherein n is an integer.
S104a2, performing image low-frequency processing on the local area to obtain a low-frequency local area.
Step S104a1 and step S104a2 are processes for obtaining low-frequency local regions in the face feature map.
Taking a face feature map as an example:
firstly, any pixel point is selected in a graph and is used as a central point, then a local area with the size of (2n +1) × (2n +1) is selected, and the local area is used as an enhancement area.
Then, the local area is processed with low frequency, i.e. the low frequency part of the local area is obtained, so as to provide a data base for obtaining the subsequent high frequency part.
In this embodiment, the pixel coordinate of the selected pixel point is (i, j), and the image low-frequency processing may use, but is not limited to, the following formula:
m_x(i, j) = 1/(2n+1)^2 · Σ_{k=i-n..i+n} Σ_{l=j-n..j+n} x(k, l)
in the formula, m_x(i, j) is the low-frequency part of the local region, and (k, l) ranges over the pixel coordinates within the local region.
m_x(i, j) can then be considered the background portion of the local region.
At this time, the local variance of the local region also needs to be obtained, which may be, but is not limited to, the following manner:
σ_x^2(i, j) = 1/(2n+1)^2 · Σ_{k=i-n..i+n} Σ_{l=j-n..j+n} [x(k, l) - m_x(i, j)]^2
in the formula, σ_x^2(i, j) represents the local variance of the local region, and x(k, l) - m_x(i, j) represents the high-frequency part of the local region. The ACE algorithm can therefore be seen as a smoothing and local correction method, i.e. the resulting low-frequency local region is locally corrected in order to obtain a detail-enhanced image.
After the low-frequency local region is obtained, it can be locally corrected; that is, following the approach set out above, the low-frequency part is obtained, then the high-frequency part, and finally the high-frequency part is multiplied by the gain value and added back to obtain the detail-enhanced image, i.e. step S104a3.
S104a3, carrying out local correction on the low-frequency local area to obtain a low-frequency local area with enhanced contrast.
In this embodiment, the result of step S104a3 is still built on the low-frequency part; the high-frequency part is simply added onto the low-frequency part so as to increase the details of the image (i.e. adding the high-frequency part increases the detail).
Since the low-frequency part of the local region has been obtained and the high-frequency part of the local region is obtained using the variance, adding the high-frequency part onto the low-frequency part with local correction yields the detail-enhanced image, that is, the contrast-enhanced low-frequency local region.
In this embodiment, the local modification formula may be, but is not limited to:
f(i, j) = m_x(i, j) + G(i, j) · [x(i, j) - m_x(i, j)]
where f(i, j) denotes the local region after detail enhancement, and G(i, j) denotes the gain applied to the high-frequency part of the local region (in this embodiment, a constant greater than or equal to 1).
By the algorithm and the formula, the high-frequency part of the local area of the selected pixel point can be reinforced, and the details of the local area are increased, so that the contrast is improved, and the resolution is enhanced.
Finally, the detail enhancement of the whole face feature map is completed simply by repeating the above operations (i.e. steps S104a1 to S104a3) for each pixel point in the face feature map, thereby achieving the contrast enhancement, as shown in step S104a4.
S104a4, selecting a next pixel point, repeating the steps until each pixel point in each face feature image is selected, and obtaining a contrast enhanced face feature image corresponding to each face feature image.
Through steps S104a1 to S104a4, the contrast of the face feature map can be increased, and the resolution of the whole image can be improved.
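As a compact illustration, the per-pixel loop of steps S104a1 to S104a4 can be vectorized over the whole map; the sketch below assumes a grayscale input and a constant gain G (the embodiment fixes G as a constant greater than or equal to 1 but leaves its value open), with cv2.blur playing the role of the (2n+1) × (2n+1) low-pass filter.

```python
# A vectorized sketch of the ACE enhancement of steps S104a1-S104a4;
# the gain value 1.5 and window n=3 are illustrative assumptions.
import cv2
import numpy as np

def ace_enhance(gray, n=3, gain=1.5):
    x = gray.astype(np.float32)
    k = 2 * n + 1
    m = cv2.blur(x, (k, k))             # m_x(i,j): low-frequency part
    high = x - m                        # high-frequency (detail) part
    f = m + gain * high                 # f(i,j) = m_x + G * (x - m_x)
    return np.clip(f, 0, 255).astype(np.uint8)
```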
After the contrast enhancement is completed, in this embodiment, image spatial information can also be added, i.e. the definition of the facial features in the image is enhanced to further improve the recognition accuracy; this is the super-resolution reconstruction disclosed in step S104b.
In this embodiment, the super-resolution reconstruction processing on each contrast-enhanced face feature map may be performed by, but is not limited to, the following steps S104b1 to S104b3.
S104b1, performing channel number expansion on each contrast enhancement face feature image in the contrast enhancement face feature images to obtain a multi-channel face feature image corresponding to each contrast enhancement face feature image.
S104b2, performing second convolution processing on each multi-channel face feature image in the multi-channel face feature images to obtain pixel face feature images of each multi-channel face feature image.
S104b3, rearranging the pixels of each pixel face feature image in the pixel face feature image according to the spatial distribution format of the pixels in the corresponding sub-image to obtain the high-resolution face feature image.
In this embodiment, steps S104b1 to S104b3 are implemented based on the sub-pixel convolution principle, also called pixel shuffling, whose principle is as follows: first, the number of channels of the image is expanded (channel expansion is essentially image magnification); second, a convolution operation is performed on the feature maps corresponding to the expanded channels (i.e. the pixels are convolved, completing pixel fusion so that the pixels of multiple channels are fused into one image); finally, the pixels are rearranged (representing the recovery of the feature pixels), thereby obtaining a large image with high resolution.
In this embodiment, for example, the convolution operation on the feature maps corresponding to the expanded channels may be, but is not limited to: the feature layer input to the sub-pixel convolution network has a depth of 256, i.e. a single 1/9-scale downsampling convolution is carried out at the 255th convolutional layer; the convolution kernel size is set to 3 (i.e. a 3 × 3 convolution scale) with stride 1, and during the convolution all pixels of all expanded channel images are pixel-fitted, so that pixel fusion is completed and the pixels of multiple channels are fused into one image.
That is, step S104b1 is the channel-expansion process; assuming the original contrast-enhanced face feature map has 3 channels, then to magnify the original image by a factor of 3, feature maps with 3^2 = 9 channels need to be generated, that is, the number of channels needs to be expanded to 9, which provides the image basis for the subsequent pixel fusion.
Step S104b2 is the pixel-convolution process; that is, the second convolution operation essentially performs pixel convolution on the 9 feature maps corresponding to the 9 channels, so that the pixels in the 9 feature maps are fused together into one large map magnified 3 times.
Finally, step S104b3 is the pixel-rearrangement process; since pixel fusion has already been completed in step S104b2, step S104b3 only needs to follow the spatial distribution format of the pixels in the original image (i.e. the corresponding sub-image) and rearrange the pixels to obtain a face image region magnified relative to the original; following the example above, a face image region magnified 3 times, namely the high-resolution face feature map, is obtained.
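The three sub-steps map directly onto the standard sub-pixel convolution building blocks; a minimal sketch, assuming PyTorch (the embodiment does not name a framework), follows.

```python
# Minimal sketch of steps S104b1-S104b3 with magnification factor r = 3:
# a 3x3, stride-1 convolution expands the channel count by r^2
# (S104b1/S104b2), and PixelShuffle rearranges those channels into an
# image 3 times larger (S104b3). Channel counts are illustrative.
import torch
import torch.nn as nn

class SubPixelUpscale(nn.Module):
    def __init__(self, channels=3, r=3):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels * r * r,
                              kernel_size=3, stride=1, padding=1)
        self.shuffle = nn.PixelShuffle(r)   # pixel rearrangement

    def forward(self, x):                   # x: (N, 3, H, W)
        return self.shuffle(self.conv(x))   # -> (N, 3, 3H, 3W)
```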
Therefore, through steps S104a and S104b and the sub-steps they comprise, the local contrast of the face feature map and the definition of the facial features can be enhanced, providing excellent image preprocessing data for the subsequent face recognition and improving the recognition accuracy.
After the high-resolution face feature map is obtained, step S105 may be performed, that is, face recognition is performed to complete person recognition.
And S105, carrying out face recognition on each high-resolution face feature image in the high-resolution face feature images to obtain a face recognition result.
Step S105 is a process of performing image recognition, that is, a process of performing face recognition on each high-resolution face feature map, and since the edge information of the image is extracted as much as possible and the resolution is enhanced in the above steps S103 and S104, the image can be recognized with high accuracy in step S105, and an image whose image information is lost due to transmission compression can be effectively dealt with.
In this embodiment, for example, the face recognition of step S105 may be performed by, but is not limited to, the following steps S105a to S105b.
And S105a, performing face image side face correction on each high-resolution face characteristic image to obtain a face correction characteristic image corresponding to each high-resolution face characteristic image.
And S105b, carrying out face recognition on each face correction feature image in the face correction feature images to obtain a face recognition result.
In this embodiment, since the faces in crawled images are not necessarily frontal faces and may be inclined at various angles, side-face correction is also required to reduce the recognition error and improve the recognition accuracy.
In this embodiment, a trained Generative Adversarial Network (GAN) is used to correct the high-resolution face feature map. The principle is as follows: a face offset radian is defined, the high-resolution face feature map is input into the GAN, convolution is performed with the convolutional layers of the GAN, the discriminator in the GAN is activated with an activation function, and a loss function is used to express the effect increment of the convolution (in general, the larger the offset radian of the face image, the higher the effect increment); meanwhile, during convolution the GAN constructs its generator from an internal self-encoder. After the loss function is obtained, it is fed to the discriminator and the generator, yielding two groups of loss function values that are used for cross-verification; finally, the verification result is used to test the correction effect through the GAN, and at each convolution iteration the frontal-face image features corresponding to the input high-resolution face feature map are fitted, so that the input high-resolution face feature map can be matched with the frontal-face image features, thereby realizing side-face correction and reducing the recognition error.
Through the design, the correction of the human face side face in the image can be realized, so that the recognition error is reduced, and the recognition precision is improved.
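The embodiment describes the GAN only at a high level; purely for orientation, a heavily simplified generator/discriminator skeleton might look as follows, where every layer size, activation and loss is an assumption of this sketch rather than the patent's design.

```python
# Heavily simplified skeleton (assumed PyTorch) of the GAN used for
# side-face correction: a self-encoder generator maps a side face toward
# frontal-face features while a discriminator scores the result.
import torch.nn as nn

class Generator(nn.Module):                 # internal self-encoder
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU())
        self.dec = nn.Sequential(nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        return self.dec(self.enc(x))

class Discriminator(nn.Module):             # scores "is this a frontal face?"
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)
```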
After the face correction feature map is obtained, face recognition can be performed; in this embodiment, the face recognition on the face correction feature map may be performed by, but is not limited to, the following steps S105b1 to S105b3.
And S105b1, performing image comparison on the face in each face correction characteristic image and the label face in the face database to obtain an image comparison result.
S105b2, judging whether the image comparison result meets a preset threshold value.
And S105b3, if yes, taking the label face corresponding to the image comparison result which meets a preset threshold value as the face recognition result.
In this embodiment, the face recognition is to perform image comparison between the face in each face correction feature map and the labeled face in the face database, so as to obtain a recognition result. That is, step S105b1 is a process of performing image comparison, and step S105b2 is a process of performing judgment.
In this embodiment, the obtained image comparison result is a comparison value, calculated using the following formula:
Thresh=TP/(TP+FP)*100%
In the formula, Thresh is the image comparison result, TP is the number of pixel-point samples in the input face image that match the current labeled face, and FP is the number of pixel-point samples in the input face image that do not match the current labeled face. If the calculated Thresh is greater than 0.5, the current labeled face is considered to match the input face and can be output as the person recognition result; otherwise, it is not used as the person recognition result.
In this embodiment, the tagged faces in the face database may be, but are not limited to being, pre-stored.
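To make the decision rule concrete, a toy sketch follows; the face_scores mapping of per-label pixel-agreement counts is a hypothetical input, while the Thresh formula and the 0.5 threshold come from the embodiment.

```python
# Toy sketch of steps S105b1-S105b3: Thresh = TP / (TP + FP), accepted
# when it exceeds the preset threshold of 0.5.
def identify(face_scores):
    # face_scores: hypothetical {label: (TP, FP)} counts against each
    # labeled face pre-stored in the face database
    best_label, best_score = None, 0.0
    for label, (tp, fp) in face_scores.items():
        thresh = tp / (tp + fp)
        if thresh > 0.5 and thresh > best_score:
            best_label, best_score = label, thresh
    return best_label                 # None if no labeled face matches
```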
Through the image-based person identification method described in detail in steps S101 to S105, the image to be recognized is segmented and the face features of each sub-image obtained by segmentation are extracted, so that the image edge information can be fully extracted and loss of face information avoided; meanwhile, resolution enhancement processing is performed on the extracted face feature image, which increases the resolution of the image and with it the recognition accuracy. Therefore, the invention can effectively handle person recognition in low-resolution images whose picture information has been lost after encoding and compression, and can fully satisfy person recognition for pictures crawled from web pages.
As shown in fig. 2, a second aspect of the present embodiment provides a hardware apparatus for implementing the image-based person identification method in the first aspect of the embodiment, including: the system comprises an acquisition unit, an image cutting unit, a human face feature extraction unit, a resolution enhancement unit and a human face recognition unit.
The acquisition unit is used for acquiring the image to be identified.
The image cutting unit is used for carrying out image cutting on the image to be identified to obtain a plurality of sub-images.
The face feature extraction unit is used for extracting the face feature of each sub-image in the plurality of sub-images to obtain at least one face feature image.
And the resolution enhancement unit is used for performing resolution enhancement processing on each face feature image in the at least one face feature image to obtain a high-resolution face feature image corresponding to each face feature image.
And the face recognition unit is used for carrying out face recognition on each high-resolution face feature image in the high-resolution face feature images to obtain a face recognition result.
In one possible design, the face feature extraction unit includes: the system comprises a screening subunit, a first convolution processing subunit and a face feature image extracting subunit.
And the screening subunit is used for screening each sub-image, removing the sub-images without the human face and obtaining at least one screened sub-image.
The first convolution processing subunit is configured to perform first convolution processing on each filtered sub-image of the at least one filtered sub-image, so as to obtain a face position feature map corresponding to each filtered sub-image.
The face feature map extraction subunit is configured to extract a face image region in the face position feature map to obtain the at least one face feature map.
In one possible design, the resolution enhancement unit includes: a contrast frequency domain range enhancer unit and a super-resolution reconstruction subunit.
And the contrast frequency domain range increasing subunit is used for performing enhancement processing on the image contrast frequency domain range on each human face feature image to obtain a contrast enhanced human face feature image corresponding to each human face feature image.
And the super-resolution reconstruction subunit is used for performing super-resolution reconstruction processing on each contrast enhanced face feature image in the contrast enhanced face feature images to obtain a high-resolution face feature image corresponding to each face feature image.
In one possible design;
the contrast frequency domain range increasing subunit is specifically configured to select any pixel point in each face feature map, and obtain a local area which is centered on the pixel point and has a size of (2n +1) × (2n +1), where n is an integer.
And the contrast frequency domain range increasing subunit is specifically used for performing image low-frequency processing on the local area to obtain a low-frequency local area.
And the contrast frequency domain range increasing subunit is specifically used for locally correcting the low-frequency local area to obtain a low-frequency local area with enhanced contrast.
And the contrast frequency domain range increasing subunit is further specifically used for selecting a next pixel point, and repeating the steps until each pixel point in each face feature image is selected completely, so as to obtain a contrast enhanced face feature image corresponding to each face feature image.
In one possible design;
the super-resolution reconstruction subunit is specifically configured to perform channel number expansion on each contrast-enhanced face feature map in the contrast-enhanced face feature maps to obtain a multi-channel face feature map corresponding to each contrast-enhanced face feature map.
The super-resolution reconstruction subunit is specifically configured to perform second convolution processing on each of the multi-channel face feature maps to obtain a pixel face feature map of each of the multi-channel face feature maps.
The super-resolution reconstruction subunit is further specifically configured to rearrange the pixels of each of the pixel face feature maps in the pixel face feature map according to a spatial distribution format of the pixels in the corresponding sub-image, so as to obtain the high-resolution face feature map.
In one possible design, the face recognition unit includes: a side face correction subunit and an image identification subunit.
The side face correction subunit is configured to perform side face correction on the face image of each high-resolution face feature map to obtain a face correction feature map corresponding to each high-resolution face feature map.
The image identification subunit is configured to perform face recognition on each of the face correction feature maps to obtain the face recognition result.
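Full side face (yaw) correction is typically learned, as in the GAN-based rectifier listed among the patent citations below (GLD-GAN), and is beyond a few lines of code. As a much simpler geometric stand-in, the following sketch only levels an in-plane tilt by rotating the face so the eye line is horizontal; the eye coordinates are assumed to come from a landmark detector that is not shown.

```python
# Geometric stand-in for side face correction: in-plane tilt removal only.
# Out-of-plane (yaw) correction as in GLD-GAN is not attempted here.
import cv2
import numpy as np

def level_face(face, left_eye, right_eye):
    # Rotate about the midpoint between the eyes so the eye line is horizontal.
    dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    m = cv2.getRotationMatrix2D(center, angle, 1.0)
    h, w = face.shape[:2]
    return cv2.warpAffine(face, m, (w, h))
```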
In one possible design:
the image identification subunit is specifically configured to compare the face in each face correction feature map with the labeled faces in a face database to obtain an image comparison result;
the image identification subunit is further specifically configured to determine whether the image comparison result meets a preset threshold and, if so, to take the labeled face corresponding to the image comparison result meeting the preset threshold as the face recognition result.
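A minimal sketch of this comparison-and-threshold decision, assuming each face has first been mapped to an embedding vector by some feature extractor (not shown) and that comparison means cosine similarity; the 0.6 threshold is an illustrative assumption, not a value from the patent.

```python
# Compare a query face embedding against labeled database embeddings and
# accept the best match only if it meets the preset threshold.
import numpy as np

def identify(query_vec, database, threshold=0.6):
    """database: list of (label, embedding) pairs for the labeled faces."""
    best_label, best_score = None, -1.0
    for label, ref_vec in database:
        score = np.dot(query_vec, ref_vec) / (
            np.linalg.norm(query_vec) * np.linalg.norm(ref_vec))
        if score > best_score:
            best_label, best_score = label, score
    # Only a comparison result that meets the preset threshold counts.
    return best_label if best_score >= threshold else None
```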
For the working process, working details, and technical effects of the hardware apparatus provided in this embodiment, reference may be made to the first aspect of this embodiment; details are not repeated here.
As shown in fig. 3, a third aspect of this embodiment provides a second hardware device for implementing the image-based person identification method according to the first aspect. The device includes a memory, a processor, and a transceiver sequentially connected in communication, where the memory is configured to store a computer program, the transceiver is configured to transmit and receive messages, and the processor is configured to read the computer program and execute the image-based person identification method according to the first aspect of this embodiment.
For example, the Memory may include, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Flash Memory, a First-In-First-Out memory (FIFO), and/or a First-In-Last-Out memory (FILO); the processor may be, but is not limited to, a microprocessor of the STM32F105 series, a Reduced Instruction Set Computer (RISC) microprocessor, an X86-architecture processor, or a processor integrated with a Neural-network Processing Unit (NPU); the transceiver may be, but is not limited to, a Wireless Fidelity (Wi-Fi) wireless transceiver, a Bluetooth wireless transceiver, a General Packet Radio Service (GPRS) wireless transceiver, a ZigBee wireless transceiver (a low-power local area network protocol based on the IEEE 802.15.4 standard), a 3G transceiver, a 4G transceiver, and/or a 5G transceiver. In addition, the device may also include, but is not limited to, a power module, a display screen, and other necessary components.
For the working process, working details, and technical effects of the hardware device provided in this embodiment, reference may be made to the first aspect of this embodiment; details are not repeated here.
A fourth aspect of this embodiment provides a computer-readable storage medium storing instructions that, when run on a computer, perform the image-based person identification method according to the first aspect of this embodiment. The computer-readable storage medium is a carrier for storing data and may include, but is not limited to, floppy disks, optical disks, hard disks, flash memories, flash disks, and/or memory sticks (Memory Stick); the computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
For the working process, working details, and technical effects of the computer-readable storage medium provided in this embodiment, reference may be made to the first aspect of this embodiment; details are not repeated here.
A fifth aspect of this embodiment provides a computer program product comprising instructions that, when run on a computer, cause the computer to perform the image-based person identification method according to the first aspect of this embodiment; the computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus.
The embodiments described above are merely illustrative. Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement this without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device to perform the methods described in the embodiments or some portions of the embodiments.
The invention is not limited to the above alternative embodiments; any product obtained by anyone in light of the present invention, whatever change in shape or structure it involves, falls within the scope of protection of the present invention as defined in the claims.

Claims (7)

1. An image-based person identification method, comprising:
acquiring an image to be identified;
performing image cutting on the image to be identified to obtain a plurality of sub-images;
performing face feature extraction on each sub-image of the plurality of sub-images to obtain at least one face feature map;
performing resolution enhancement processing on each face feature map of the at least one face feature map to obtain a high-resolution face feature map corresponding to each face feature map;
performing face recognition on each of the high-resolution face feature maps to obtain a face recognition result;
wherein performing resolution enhancement processing on each face feature map of the at least one face feature map to obtain a high-resolution face feature map corresponding to each face feature map comprises:
performing enhancement processing of the image contrast frequency domain range on each face feature map to obtain a contrast-enhanced face feature map corresponding to each face feature map;
performing super-resolution reconstruction processing on each of the contrast-enhanced face feature maps to obtain a high-resolution face feature map corresponding to each face feature map;
wherein performing enhancement processing of the image contrast frequency domain range on each face feature map to obtain a contrast-enhanced face feature map corresponding to each face feature map comprises:
selecting any pixel point in each face feature map, and obtaining a local area centered on that pixel point with a size of (2n+1)×(2n+1), where n is an integer;
performing image low-frequency processing on the local area to obtain a low-frequency local area;
performing local correction on the low-frequency local area to obtain a contrast-enhanced low-frequency local area;
selecting the next pixel point, and repeating the above steps until every pixel point in each face feature map has been processed, to obtain a contrast-enhanced face feature map corresponding to each face feature map;
wherein performing super-resolution reconstruction processing on each of the contrast-enhanced face feature maps to obtain a high-resolution face feature map corresponding to each face feature map comprises:
performing channel number expansion on each of the contrast-enhanced face feature maps to obtain a multi-channel face feature map corresponding to each contrast-enhanced face feature map;
performing second convolution processing on each of the multi-channel face feature maps to obtain a pixel face feature map of each multi-channel face feature map;
rearranging the pixels of each pixel face feature map according to the spatial distribution format of the pixels in the corresponding sub-image to obtain the high-resolution face feature map;
wherein the second convolution processing is performed on each of the multi-channel face feature maps as follows:
performing, once on the 255 convolution layers of the sub-pixel convolution network, a 1/9-scale downsampling convolution on each multi-channel face feature map, wherein the convolution kernel size used is 3 and the step size is 1.
2. The method of claim 1, wherein performing face feature extraction on each sub-image of the plurality of sub-images to obtain at least one face feature map comprises:
screening each sub-image and removing the sub-images containing no human face to obtain at least one screened sub-image;
performing first convolution processing on each screened sub-image of the at least one screened sub-image to obtain a face position feature map corresponding to each screened sub-image;
extracting the face image region in each face position feature map to obtain the at least one face feature map.
3. The method of claim 1, wherein performing face recognition on each of the high-resolution face feature maps to obtain a face recognition result comprises:
performing side face correction on each high-resolution face feature map to obtain a face correction feature map corresponding to each high-resolution face feature map;
performing face recognition on each of the face correction feature maps to obtain the face recognition result.
4. The method of claim 3, wherein performing face recognition on each of the face correction feature maps to obtain the face recognition result comprises:
comparing the face in each face correction feature map with the labeled faces in a face database to obtain an image comparison result;
determining whether the image comparison result meets a preset threshold;
if so, taking the labeled face corresponding to the image comparison result meeting the preset threshold as the face recognition result.
5. An image-based person recognition apparatus, comprising: an acquisition unit, an image cutting unit, a face feature extraction unit, a resolution enhancement unit, and a face recognition unit;
the acquisition unit is configured to acquire an image to be identified;
the image cutting unit is configured to perform image cutting on the image to be identified to obtain a plurality of sub-images;
the face feature extraction unit is configured to perform face feature extraction on each sub-image of the plurality of sub-images to obtain at least one face feature map;
the resolution enhancement unit is configured to perform resolution enhancement processing on each face feature map of the at least one face feature map to obtain a high-resolution face feature map corresponding to each face feature map;
the face recognition unit is configured to perform face recognition on each of the high-resolution face feature maps to obtain a face recognition result;
the resolution enhancement unit includes: a contrast frequency domain range enhancement subunit and a super-resolution reconstruction subunit;
the contrast frequency domain range enhancement subunit is configured to perform enhancement processing of the image contrast frequency domain range on each face feature map to obtain a contrast-enhanced face feature map corresponding to each face feature map;
the super-resolution reconstruction subunit is configured to perform super-resolution reconstruction processing on each of the contrast-enhanced face feature maps to obtain a high-resolution face feature map corresponding to each face feature map;
the contrast frequency domain range enhancement subunit is specifically configured to select any pixel point in each face feature map and obtain a local area centered on that pixel point with a size of (2n+1)×(2n+1), where n is an integer;
the contrast frequency domain range enhancement subunit is specifically configured to perform image low-frequency processing on the local area to obtain a low-frequency local area;
the contrast frequency domain range enhancement subunit is specifically configured to perform local correction on the low-frequency local area to obtain a contrast-enhanced low-frequency local area;
the contrast frequency domain range enhancement subunit is further specifically configured to select the next pixel point and repeat the above steps until every pixel point in each face feature map has been processed, obtaining a contrast-enhanced face feature map corresponding to each face feature map;
the super-resolution reconstruction subunit is specifically configured to perform channel number expansion on each of the contrast-enhanced face feature maps to obtain a multi-channel face feature map corresponding to each contrast-enhanced face feature map;
the super-resolution reconstruction subunit is specifically configured to perform second convolution processing on each of the multi-channel face feature maps to obtain a pixel face feature map of each multi-channel face feature map;
the super-resolution reconstruction subunit is further specifically configured to rearrange the pixels of each pixel face feature map according to the spatial distribution format of the pixels in the corresponding sub-image to obtain the high-resolution face feature map.
6. An image-based person recognition apparatus, comprising: a memory, a processor, and a transceiver sequentially connected in communication, wherein the memory is configured to store a computer program, the transceiver is configured to transmit and receive messages, and the processor is configured to read the computer program and execute the image-based person identification method according to any one of claims 1 to 4.
7. A computer-readable storage medium having instructions stored thereon that, when run on a computer, perform the image-based person identification method according to any one of claims 1 to 4.
CN202011167731.7A 2020-10-27 2020-10-27 Image-based person identification method and device and computer-readable storage medium Active CN112348783B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011167731.7A CN112348783B (en) 2020-10-27 2020-10-27 Image-based person identification method and device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011167731.7A CN112348783B (en) 2020-10-27 2020-10-27 Image-based person identification method and device and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN112348783A CN112348783A (en) 2021-02-09
CN112348783B true CN112348783B (en) 2022-08-05

Family

ID=74359211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011167731.7A Active CN112348783B (en) 2020-10-27 2020-10-27 Image-based person identification method and device and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN112348783B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018176000A1 (en) 2017-03-23 2018-09-27 DeepScale, Inc. Data synthesis for autonomous control systems
US10671349B2 (en) 2017-07-24 2020-06-02 Tesla, Inc. Accelerated mathematical engine
US11409692B2 (en) 2017-07-24 2022-08-09 Tesla, Inc. Vector computational unit
US11157441B2 (en) 2017-07-24 2021-10-26 Tesla, Inc. Computational array microprocessor system using non-consecutive data formatting
US11893393B2 (en) 2017-07-24 2024-02-06 Tesla, Inc. Computational array microprocessor system with hardware arbiter managing memory requests
US11561791B2 (en) 2018-02-01 2023-01-24 Tesla, Inc. Vector computational unit receiving data elements in parallel from a last row of a computational array
US11215999B2 (en) 2018-06-20 2022-01-04 Tesla, Inc. Data pipeline and deep learning system for autonomous driving
US11361457B2 (en) 2018-07-20 2022-06-14 Tesla, Inc. Annotation cross-labeling for autonomous control systems
US11636333B2 (en) 2018-07-26 2023-04-25 Tesla, Inc. Optimizing neural network structures for embedded systems
US11562231B2 (en) 2018-09-03 2023-01-24 Tesla, Inc. Neural networks for embedded devices
KR20210072048A (en) 2018-10-11 2021-06-16 테슬라, 인크. Systems and methods for training machine models with augmented data
US11196678B2 (en) 2018-10-25 2021-12-07 Tesla, Inc. QOS manager for system on a chip communications
US11816585B2 (en) 2018-12-03 2023-11-14 Tesla, Inc. Machine learning models operating at different frequencies for autonomous vehicles
US11537811B2 (en) 2018-12-04 2022-12-27 Tesla, Inc. Enhanced object detection for autonomous vehicles based on field view
US11610117B2 (en) 2018-12-27 2023-03-21 Tesla, Inc. System and method for adapting a neural network model on a hardware platform
US10997461B2 (en) 2019-02-01 2021-05-04 Tesla, Inc. Generating ground truth for machine learning from time series elements
US11567514B2 (en) 2019-02-11 2023-01-31 Tesla, Inc. Autonomous and user controlled vehicle summon to a target
US10956755B2 (en) 2019-02-19 2021-03-23 Tesla, Inc. Estimating object properties using visual image data
CN113205019A (en) * 2021-04-22 2021-08-03 上海电力大学 Method for detecting defective insulator

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010286959A (en) * 2009-06-10 2010-12-24 Nippon Telegr & Teleph Corp <Ntt> Method, device and program for enhancing face image resolution
CN110046559A (en) * 2019-03-28 2019-07-23 广东工业大学 A kind of face identification method
CN110363116A (en) * 2019-06-28 2019-10-22 上海交通大学 Irregular face antidote, system and medium based on GLD-GAN
CN110837781A (en) * 2019-10-16 2020-02-25 平安科技(深圳)有限公司 Face recognition method, face recognition device and electronic equipment
CN111061887A (en) * 2019-12-18 2020-04-24 广东智媒云图科技股份有限公司 News character photo extraction method, device, equipment and storage medium
CN111259841A (en) * 2020-01-20 2020-06-09 深圳云天励飞技术有限公司 Image processing method and related equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Illumination Normalization for Face Recognition under Extreme Lighting Conditions; Yong Cheng et al.; Intelligent Science and Intelligent Data Engineering: Lecture Notes in Computer Science; 2013-12-03; pp. 491-497 *
Face Recognition Based on Illumination-Normalized Block Complete LBP Features; Zhou Wei et al.; Computer Engineering and Applications; 2015-11-30; pp. 145-149 *
Adaptive Contrast Enhancement (ACE): Algorithm Principle and Implementation; Naruto_Q; https://blog.csdn.net/piaoxuezhong/article/details/78385517; 2017-10-29; pp. 1-2 *

Also Published As

Publication number Publication date
CN112348783A (en) 2021-02-09

Similar Documents

Publication Publication Date Title
CN112348783B (en) Image-based person identification method and device and computer-readable storage medium
CN108345892B (en) Method, device and equipment for detecting significance of stereo image and storage medium
CN111428781A (en) Remote sensing image ground object classification method and system
CN111275034B (en) Method, device, equipment and storage medium for extracting text region from image
US11915465B2 (en) Apparatus and methods for converting lineless tables into lined tables using generative adversarial networks
CN110991310B (en) Portrait detection method, device, electronic equipment and computer readable medium
CN113221925B (en) Target detection method and device based on multi-scale image
CN113689434B (en) Image semantic segmentation method based on strip pooling
CN103632153A (en) Region-based image saliency map extracting method
CN108230269B (en) Grid removing method, device and equipment based on depth residual error network and storage medium
CN115131797A (en) Scene text detection method based on feature enhancement pyramid network
CN115482529A (en) Method, equipment, storage medium and device for recognizing fruit image in near scene
CN113592720B (en) Image scaling processing method, device, equipment and storage medium
CN112184587A (en) Edge data enhancement model, and efficient edge data enhancement method and system based on model
CN113537187A (en) Text recognition method and device, electronic equipment and readable storage medium
CN116798041A (en) Image recognition method and device and electronic equipment
CN115909378A (en) Document text detection model training method and document text detection method
CN113793264B (en) Archive image processing method and system based on convolution model and electronic equipment
CN115205113A (en) Image splicing method, device, equipment and storage medium
CN115376022B (en) Application of small target detection algorithm in unmanned aerial vehicle aerial photography based on neural network
CN114187186B (en) Paper laboratory sheet image preprocessing method and system
CN112825141B (en) Method and device for recognizing text, recognition equipment and storage medium
CN113033531B (en) Method and device for identifying text in image and electronic equipment
JP7210380B2 (en) Image learning program, image learning method, and image recognition device
CN116386064A (en) Image text detection method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant