CN114093004B - Face fusion comparison method and device based on multiple cameras - Google Patents


Info

Publication number
CN114093004B
CN114093004B (application CN202111415185.9A)
Authority
CN
China
Prior art keywords
face
pictures
effective
picture
rgb
Prior art date
Legal status: Active
Application number
CN202111415185.9A
Other languages
Chinese (zh)
Other versions
CN114093004A (en
Inventor
张利
拜正斌
严军
李阳
刘浩
唐波
欧华平
赵玲
饶龙强
王文琪
Current Assignee
Chengdu Zhiyuanhui Information Technology Co Ltd
Original Assignee
Chengdu Zhiyuanhui Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Chengdu Zhiyuanhui Information Technology Co Ltd
Priority to CN202111415185.9A
Publication of CN114093004A
Application granted
Publication of CN114093004B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a face fusion comparison method and device based on multiple cameras, comprising the following steps. S1, video data stream acquisition: obtain video stream data from a plurality of camera groups. S2, picture extraction: at the same time T, obtain the groups of RGB pictures corresponding to the groups of video stream data. S3, face extraction: extract faces from each group of RGB pictures to obtain face pictures, calculate the screen ratio of each face picture, and set the face picture with the largest ratio as the effective face picture of that group of RGB pictures. S4, picture screening: pass the effective face pictures of all groups of RGB pictures through a picture screening algorithm to obtain the effective face pictures of the RGB pictures marked as effective data. S5, uploading: send the effective face picture of the RGB picture marked as effective data to a face comparison server. By fusing and comparing pictures from the multiple cameras, the invention effectively improves the efficiency of face image acquisition, greatly raises the face recognition rate, and has great practical value.

Description

Face fusion comparison method and device based on multiple cameras
Technical Field
The invention relates to the technical field of smart cities, in particular to a face fusion comparison method and device based on multiple cameras.
Background
As the security field develops rapidly, cameras have become increasingly capable: existing cameras generally support communication protocols, and both wired and wireless remote video access can be realized. Meanwhile, as safety requirements rise, more and more cameras are installed in traffic, streets and similar places for monitoring. People can therefore be identified through these cameras, and processing multiple cameras in real time improves efficiency.
Face recognition and pedestrian re-identification are the key technologies for identifying a specific pedestrian. However, most effective current methods use deep neural networks, which occupy large amounts of memory and require heavy computation, making real-time monitored scenes difficult to process. Existing face recognition technology extracts face features from a given image or video sequence and compares them with face library data to identify a user, but such traditional systems are easily affected by external conditions such as illumination, beards, glasses, hairstyles and expressions, which lowers the recognition rate.
Disclosure of Invention
The invention aims to provide a face fusion comparison method based on multiple cameras which, by receiving video stream data from the multiple cameras and performing liveness detection, screening and sorting, and quality evaluation, solves the problem that samples in existing face recognition technology are easily influenced by external conditions, reducing face recognition efficiency.
A face fusion comparison method based on multiple cameras comprises the following steps:
S1, video data stream acquisition: obtain video stream data from a plurality of camera groups, and store the data in groups according to camera ID;
S2, picture extraction: at the same time T, obtain the groups of RGB pictures corresponding to the groups of video stream data;
S3, face extraction: extract faces from each group of RGB pictures to obtain face pictures, calculate the screen ratio of each face picture, and set the face picture with the largest ratio as the effective face picture of that group of RGB pictures;
S4, picture screening: pass the effective face pictures of all groups of RGB pictures through a picture screening algorithm to obtain the effective face pictures of the RGB pictures marked as effective data;
S5, uploading: send the effective face picture of the RGB picture marked as effective data to the face comparison server.
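The per-group selection in step S3 can be sketched as follows. This is a minimal illustration; the function name, the (left, top, width, height) box format and the use of plain tuples are assumptions for the sketch, not part of the patent:

```python
def effective_face(face_boxes, frame_w, frame_h):
    """Step S3 (sketch): among the faces detected in one RGB frame,
    return the one whose bounding box covers the largest share of
    the screen, i.e. the largest 'screen ratio'."""
    def screen_ratio(box):
        x, y, w, h = box                     # (left, top, width, height)
        return (w * h) / (frame_w * frame_h)
    # With no detections there is no effective face for this group.
    return max(face_boxes, key=screen_ratio, default=None)
```

The face with the largest ratio is kept per camera group, so each group contributes exactly one candidate to the later fusion steps.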
Further, the picture screening algorithm in step S4 is at least one of the following:
1. Comparison fusion algorithm: compare the effective face pictures of every two groups of RGB pictures for similarity by a cyclic comparison method to obtain the corresponding comparison scores;
mark the effective face pictures of the two groups of RGB pictures with the highest comparison score as effective data;
2. Screening and sorting algorithm:
similarity value: obtain the similarity value of the effective face pictures of every two groups of RGB pictures;
screening: compare each similarity value with a preset threshold and discard those below the threshold;
sorting: sort the remaining similarity values by score, and retain the effective face pictures of the two groups of RGB pictures with the highest similarity value;
3. Picture quality evaluation algorithm: evaluate the picture quality of the effective face picture of each group of RGB pictures, and mark the effective face pictures of the RGB pictures that pass the evaluation as effective data.
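The comparison fusion plus screening-and-sorting steps above can be sketched as follows: score every pair, discard pairs below the threshold, keep the best pair. The `similarity` callback and the dictionary layout are illustrative assumptions:

```python
from itertools import combinations

def screen_pairs(faces, similarity, threshold=0.85):
    """Steps SA-SB (sketch): score every pair of effective face
    pictures with a caller-supplied similarity function, drop pairs
    below the preset threshold, and keep the highest-scoring pair.
    Returns ((name_a, name_b), score) or None if nothing survives."""
    scored = [((a, b), similarity(faces[a], faces[b]))
              for a, b in combinations(sorted(faces), 2)]
    kept = [(pair, s) for pair, s in scored if s >= threshold]
    return max(kept, key=lambda item: item[1]) if kept else None
```

In the patent the similarity would come from comparing face features (see the Euclidean/cosine note later in this section); here any callable returning a score in [0, 1] works.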
Further, the picture screening algorithm is composed of one or more of the comparison fusion algorithm, the screening and sorting algorithm and the picture quality evaluation algorithm, applied in different orders.
Further, when the picture screening algorithm applies the comparison fusion algorithm, the screening and sorting algorithm and the picture quality evaluation algorithm in that order, step S4 specifically comprises the following steps:
SA, comparison fusion: compare the effective face pictures of every two groups of RGB pictures for similarity by a cyclic comparison method to obtain the corresponding comparison scores;
SB, screening and sorting:
screening: compare each comparison score with a preset threshold and discard scores below the threshold;
sorting: sort the remaining comparison scores by size, and retain the effective face pictures of the two groups of RGB pictures with the highest comparison score;
SC, picture quality evaluation: evaluate the picture quality of the effective face pictures of the two groups of RGB pictures, and mark the effective face pictures of the RGB pictures that pass the evaluation as effective data.
further, if the plurality of groups of cameras include one or more groups of binocular cameras, the binocular cameras are respectively an RGB camera and an IR camera, and step S2 further includes a step for living body identification:
s001: identifying video stream data of the binocular cameras, and calling RGB pictures at the moment T corresponding to the group of cameras;
s002: detecting RGB pictures, and recording the positions of the faces in the RGB pictures when the faces are detected;
s003: performing face detection at a position corresponding to the IR picture according to the position of the face in the RGB picture;
s004: if the face is detected, judging that the face in the current RGB picture is in a living body state, and reserving the RGB picture in the living body state;
s005: if no face is detected, judging that the face in the current RGB picture is in a non-living state, and discarding the RGB picture in the non-living state.
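The RGB-to-IR liveness check of steps S003-S005 can be sketched with a simple overlap test. The IoU measure and its 0.3 threshold are illustrative assumptions; the patent only requires a face at "the corresponding position" in the IR picture:

```python
def is_live(rgb_box, ir_boxes, iou_min=0.3):
    """Steps S003-S005 (sketch): a face found in the RGB frame is
    judged live only if some IR detection overlaps its position
    sufficiently (a printed photo or a screen replay typically yields
    no matching IR face). Boxes are (left, top, width, height)."""
    def iou(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
        iy = max(0, min(ay + ah, by + bh) - max(ay, by))
        inter = ix * iy
        union = aw * ah + bw * bh - inter
        return inter / union if union else 0.0
    return any(iou(rgb_box, b) >= iou_min for b in ir_boxes)
```

RGB pictures whose faces fail this check are discarded as possible liveness attacks, exactly as in step S005.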
Further, step S3 further comprises a step of judging the number of effective faces:
count the effective faces;
if the number of effective faces is ≥ 2, go to step SA;
if the number of effective faces = 1, go to step S5.
Further, step SB further comprises a step of judging the number of remaining comparison scores:
count the remaining comparison scores;
if the number of remaining comparison scores is ≥ 1, go to step SC;
if the number of remaining comparison scores = 0, set time T = T + 1 and go to step S2.
Further, the picture quality evaluation judges quality against several indexes and their effective ranges, which include:
index A: image blur degree, effective range (0.1, 1];
index B: face pitch angle, effective range [−20°, +20°];
index C: face mask ratio, with mouth-and-nose ratio C1 in effective range (0.9, 1) and eye ratio C2 in effective range (0.5, 1);
index D: mouth opening degree, effective range (0.1, 1);
the RGB pictures that simultaneously satisfy the effective ranges of all four indexes (image blur A, face pitch angle B, face mask ratio C and mouth opening degree D) are marked as effective data.
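A sketch of this range-based evaluation follows. The treatment of interval endpoints is an assumption, since the printed ranges are ambiguous about open versus closed bounds:

```python
def passes_quality(blur, pitch, mask_mouth_nose, mask_eyes, mouth_open):
    """Mark a picture as effective data only if all four indexes are
    in range: A blur (0.1, 1], B pitch [-20, 20] degrees, C mask
    ratios for mouth-and-nose (0.9, 1] and eyes (0.5, 1], D mouth
    opening (0.1, 1]. Upper bounds are taken as closed here, which
    is an assumption about the printed ranges."""
    return (0.1 < blur <= 1.0
            and -20.0 <= pitch <= 20.0
            and 0.9 < mask_mouth_nose <= 1.0
            and 0.5 < mask_eyes <= 1.0
            and 0.1 < mouth_open <= 1.0)
```

A single out-of-range index (e.g. a pitch of 30°) is enough to reject the picture, matching the "simultaneously satisfy" requirement.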
Further, the picture quality evaluation judges quality based on several indexes, which include:
index A: image blur degree;
index B: face pitch angle;
index C: face mask ratio, with mouth-and-nose ratio C1 and eye ratio C2;
index D: mouth opening degree;
corresponding weights wi are assigned to the image blur A, the face pitch angle B, the face mask ratio C and the mouth opening degree D, and the quality evaluation score is obtained from the formula:
score_i = w1*A + w2*B + w3*(C1 + C2) + w4*D;
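The weighted score can be computed directly. The weight values below, and any normalization of the pitch angle B into [0, 1] before weighting, are illustrative assumptions, since the patent does not fix the weights wi:

```python
def quality_score(A, B, C1, C2, D, w=(0.4, 0.2, 0.2, 0.2)):
    """score_i = w1*A + w2*B + w3*(C1 + C2) + w4*D, per the patent's
    formula. The default weights are illustrative; inputs are assumed
    already normalized to comparable scales."""
    w1, w2, w3, w4 = w
    return w1 * A + w2 * B + w3 * (C1 + C2) + w4 * D
```

Higher scores indicate better quality; the picture with the highest score would be kept as the final effective picture.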
Further, in step S4 the similarity comparison of effective faces is performed by computing the Euclidean distance or cosine distance between the two groups of effective faces, where an effective face is the face region with the largest screen ratio in its RGB picture.
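Both distance measures mentioned here are standard; a plain-Python sketch over face feature vectors:

```python
import math

def euclidean_distance(u, v):
    """Smaller distance between two face feature vectors = more similar."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Closer to 1 = more similar; cosine distance is 1 minus this."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

How the feature vectors themselves are extracted (e.g. by a face embedding network) is outside the scope of this claim; any fixed-length numeric vectors work with these functions.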
A face fusion comparison device based on multiple cameras comprises:
a memory;
one or more processors; and
one or more modules stored in memory and configured to be executed by the one or more processors, the one or more modules comprising:
video data stream acquisition module: used to obtain video stream data from a plurality of camera groups and store the data in groups according to camera ID;
picture extraction module: used to extract, at the same time T, the groups of RGB pictures corresponding to the groups of video stream data;
face extraction module: used to extract faces from each group of RGB pictures to obtain face pictures, calculate the screen ratio of each face picture, and set the face picture with the largest ratio as the effective face picture of that group of RGB pictures;
picture screening module: used to pass the effective face pictures of all groups of RGB pictures through a picture screening algorithm to obtain the effective face pictures of the RGB pictures marked as effective data;
uploading module: used to send the effective face picture of the RGB picture marked as effective data to the face comparison server.
Further, the cameras are binocular cameras and/or monocular cameras arranged at several different angles in the security inspection area.
The invention has the following beneficial effects:
1. Multiple cameras track the face, so several target face images under different illumination, poses, degrees of blur and so on can be acquired in the monitored scene. Through liveness detection, screening and sorting, and quality evaluation, these face images of varying quality are preprocessed and the single best-quality image is selected as effective data. This scheme effectively improves the efficiency of face image acquisition, greatly raises the face recognition rate, and has great practical value;
2. The fusion comparison technique processes the multiple cameras in parallel, so the videos from the multi-source cameras do not block one another and high real-time performance is guaranteed.
Drawings
FIG. 1 is a schematic diagram of the multi-camera face fusion comparison method of the present invention;
FIG. 2 is a schematic diagram of a face fusion comparison device with a monocular camera according to the present invention;
FIG. 3 is a schematic diagram of a face fusion comparison device with a multi-camera module according to the present invention;
FIG. 4 is a schematic diagram of a face fusion comparison device with multiple multi-camera modules according to the present invention;
FIG. 5 is a schematic diagram of a face fusion comparison device with multiple cameras according to the present invention;
FIG. 6 is a schematic diagram of the composition of the picture screening module according to embodiment 4 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
In the description and drawings, particular implementations of embodiments of the invention are disclosed in detail as being indicative of some of the ways in which the principles of embodiments of the invention may be employed, but it is understood that the scope of the embodiments of the invention is not limited correspondingly. On the contrary, the embodiments of the invention include all alternatives, modifications and equivalents as may be included within the spirit and scope of the appended claims.
Example 1
The purpose of this embodiment is to provide a face fusion comparison method based on multiple binocular cameras, comprising:
1. Camera data stream acquisition: access several groups of binocular cameras and extract the video streams by module; each group consists of an RGB (color) camera and an IR (infrared) camera. The streams are divided and stored by module, and the number of video streams extracted per module = 2;
2. Extract pictures from video: at the same moment, extract pictures from the two video channels of each group, namely the RGB picture and the IR picture (or depth picture), so that the number of picture groups = the number of modules, and the number of pictures = the number of cameras = 2 × the number of modules;
3. Perform liveness detection by group: compare the pictures obtained in step 2 group by group to detect liveness attacks, and retain the RGB pictures of the groups whose liveness identification succeeds;
4. Extract faces from the pictures: perform face extraction on the (live) RGB pictures from step 3 to obtain face pictures; resolve the case of multiple faces in one picture by the face screen ratio (the share of the whole screen occupied by the face), and take the face with the largest ratio as the effective face of the picture;
5. RGB picture comparison fusion: a local face picture comparison library is built in; the effective face pictures of the RGB pictures from step 4 are stored in the local face library and compared with one another to obtain the effective face pictures of the two best-matching groups of RGB pictures.
In step 4, n effective faces are extracted and pictures are compared m at a time (in this scene m = 2, i.e. pairwise comparison); the number of comparisons the algorithm performs follows the combination formula:
the number of all combinations of m (m ≤ n) elements taken from n distinct elements, denoted C(n, m):
C(n, m) = n! / (m! (n − m)!)
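The comparison count can be checked with Python's standard library, whose `math.comb` implements exactly C(n, m):

```python
import math

# n effective faces compared pairwise (m = 2) yield C(n, 2) comparisons.
n, m = 3, 2                 # e.g. three effective faces A, B, C
print(math.comb(n, m))      # 3 comparisons: A-B, A-C, B-C
```

With only two effective faces, C(2, 2) = 1, matching the single A-vs-C comparison of example 2 below.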
Example 1: in step 4 the effective face pictures extracted from the RGB pictures are A, B and C; after they are stored, the local face library holds the three pictures A, B and C, and the following comparisons are performed:
when score ≥ threshold (e.g. 0.85, adjustable), the pair is considered valid face data and retained;
when score < threshold (e.g. 0.85, adjustable), the pair is considered invalid face data and discarded;
Comparison      Score                 Result
A vs B          score1 (e.g. 0.92)    effective data, retained
A vs C          score2 (e.g. 0.94)    effective data, retained
B vs C          score3 (e.g. 0.84)    discarded
The pair with the highest comparison score is retained, yielding the picture group A and C.
Example 2: in step 4 the effective face pictures extracted from the RGB pictures are A and C; after they are stored, the local face library holds the two pictures A and C, and the following comparison is performed:
when score ≥ threshold (e.g. 0.85, adjustable), the pair is considered valid data and retained, and the flow proceeds to step 7;
Comparison      Score                Result
A vs C          score (e.g. 0.92)    effective data, retained
when score < threshold (e.g. 0.85, adjustable), the pair is considered invalid data and discarded, and the algorithm ends;
Comparison      Score                Result
A vs C          score (e.g. 0.84)    discarded
Example 3: in step 4 only one effective face picture, picture A, is extracted; no comparison is needed, so the comparison step is skipped and the flow proceeds directly to step 7.
A score below the threshold (e.g. 0.85, adjustable) is treated as two different people, which prevents pictures of different people captured by the multiple cameras, or captured at different angles, from being fused as the same person (misrecognition for short), so that the effective data are determined reliably. This approach increases the number of usable face detections and effectively reduces the misrecognition rate;
6. Picture quality detection and selection: compare the picture quality of the effective face pictures of the two groups of RGB pictures obtained in step 5. The quality comparison includes, but is not limited to, image blur degree, face pitch angle, face mask ratio and mouth opening degree; these indexes determine the quality of the group of data, and the best-quality picture is taken as the final effective picture;
parameter description:
image blur degree: effective range of the parameter value (0.1, 1];
face pitch angle: effective range of the parameter value [−20°, +20°];
face mask ratio: effective range of the mouth-and-nose parameter value (0.9, 1), effective range of the eye parameter value (0.5, 1);
mouth opening degree: effective range of the parameter value (0.1, 1).
Parameter relation: a picture is regarded as effective only when all effective ranges are satisfied simultaneously.
7. Picture face comparison: send the face picture from step 6 to the background face comparison service, obtain the final comparison result, and complete the subsequent business processing.
Example 2
The purpose of this embodiment is to provide a face fusion comparison method based on multiple monocular cameras, comprising:
1. Camera data stream acquisition: access several groups of monocular cameras and extract the video streams by module; each group provides an RGB (color camera) stream. The streams are divided and stored by module, and the number of extracted video streams = the number of modules;
2. Extract pictures from video: at the same moment, extract the RGB picture from the video of each group, so that the number of picture groups = the number of modules, and the number of pictures = the number of modules;
3. Extract faces from the pictures: perform face extraction on the RGB pictures from step 2 to obtain face pictures; resolve the case of multiple faces in one picture by the face screen ratio (the share of the whole screen occupied by the face), and take the face with the largest screen ratio as the effective face of the picture;
4. RGB picture comparison fusion: a local face picture comparison library is built in; the effective face pictures of the RGB pictures from step 3 are stored in the local face library and compared with one another to obtain the effective face pictures of the two best-matching groups of RGB pictures.
5. Picture quality detection and selection: compare the picture quality of the effective face pictures of the two groups of RGB pictures obtained in step 4. The quality comparison includes, but is not limited to, image blur degree, face pitch angle, face mask ratio and mouth opening degree; these indexes determine the quality of the group of data, and the best-quality picture is taken as the final effective picture;
parameter description:
image blur degree: effective range of the parameter value (0.1, 1];
face pitch angle: effective range of the parameter value [−20°, +20°];
face mask ratio: effective range of the mouth-and-nose parameter value (0.9, 1), effective range of the eye parameter value (0.5, 1);
mouth opening degree: effective range of the parameter value (0.1, 1).
Parameter relation: a picture is regarded as effective only when all effective ranges are satisfied simultaneously.
6. Picture face comparison: send the face picture from step 5 to the background face comparison service, obtain the final comparison result, and complete the subsequent business processing.
Example 3
The purpose of this embodiment is to provide a face fusion comparison method based on multiple cameras, where a camera may be a binocular camera module (supporting face liveness detection and preventing face liveness attacks), may be degraded to a monocular camera module, or may be any free combination of binocular and monocular modules, comprising:
1. Camera data stream acquisition: access several groups of cameras and extract the video streams by module; each binocular group is divided into an RGB (color camera) stream and an IR (infrared camera) stream and stored by module;
2. Extract pictures from video: at the same moment, extract pictures from the video of each group;
3. Liveness identification:
SA: identify the video stream data of the binocular cameras, and fetch the RGB picture at time T for that camera group;
SB: detect the RGB picture; when a face is detected, record its position in the RGB picture;
SC: perform face detection at the corresponding position in the IR picture, based on the face position in the RGB picture;
SD: if a face is detected in the IR picture, judge the face in the current RGB picture to be live, and retain the RGB picture;
SE: if no face is detected in the IR picture, judge the face in the current RGB picture to be non-live, and discard the RGB picture.
4. Extract faces from the pictures: perform face extraction on the RGB pictures from step 3 to obtain face pictures; resolve the case of multiple faces in one picture by the face screen ratio (the share of the whole screen occupied by the face), and take the face with the largest screen ratio as the effective face of the picture;
5. RGB picture comparison fusion: a local face picture comparison library is built in; the effective face pictures of the RGB pictures from step 4 are stored in the local face library and compared with one another to obtain the effective face pictures of the two best-matching groups of RGB pictures.
6. Picture quality detection and selection: compare the picture quality of the effective face pictures of the two groups of RGB pictures obtained in step 5. The quality comparison includes, but is not limited to, image blur degree, face pitch angle, face mask ratio and mouth opening degree; these indexes determine the quality of the group of data, and the best-quality picture is taken as the final effective picture;
parameter description:
image blur degree: effective range of the parameter value (0.1, 1];
face pitch angle: effective range of the parameter value [−20°, +20°];
face mask ratio: effective range of the mouth-and-nose parameter value (0.9, 1), effective range of the eye parameter value (0.5, 1);
mouth opening degree: effective range of the parameter value (0.1, 1).
Parameter relation: a picture is regarded as effective only when all effective ranges are satisfied simultaneously.
7. Picture face comparison: send the face picture from step 6 to the background face comparison service, obtain the final comparison result, and complete the subsequent business processing.
Example 4
An object of the present embodiment is to provide a face fusion comparison device based on multiple cameras, including:
a memory;
one or more processors; and
one or more modules stored in memory and configured to be executed by the one or more processors, the one or more modules comprising:
video data stream acquisition module: used to obtain video stream data from a plurality of camera groups and store the data in groups according to camera ID;
picture extraction module: used to extract, at the same time T, the groups of RGB pictures corresponding to the groups of video stream data;
face extraction module: used to extract faces from each group of RGB pictures to obtain face pictures, calculate the screen ratio of each face picture, and set the face picture with the largest ratio as the effective face picture of that group of RGB pictures;
picture screening module: used to pass the effective face pictures of all groups of RGB pictures through a picture screening algorithm to obtain the effective face pictures of the RGB pictures marked as effective data;
uploading module: used to send the effective face picture of the RGB picture marked as effective data to the face comparison server.
The foregoing description of the preferred embodiment of the invention is not intended to limit the invention in any way, but rather to cover all modifications, equivalents, improvements and alternatives falling within the spirit and principles of the invention.

Claims (9)

1. A face fusion comparison method based on multiple cameras, characterized by comprising the following steps:
S1, video data stream acquisition: acquiring video stream data of a plurality of groups of cameras, and storing the video stream data in groups according to camera IDs;
S2, picture extraction: acquiring, at the same time T, a plurality of groups of RGB pictures respectively corresponding to the video stream data;
S3, face extraction: extracting faces from each group of RGB pictures to obtain a plurality of face pictures corresponding to each RGB picture, calculating the screen occupation ratio of the face in each face picture, and determining the face picture with the largest screen occupation ratio as the effective face picture of the corresponding RGB picture;
S4, picture screening: processing the effective face pictures of all groups of RGB pictures through a picture screening algorithm to obtain the effective face pictures of the RGB pictures marked as effective data;
S5, uploading: sending the effective face pictures of the RGB pictures marked as effective data to a face comparison server;
wherein the picture screening algorithm is composed, in sequence, of a comparison fusion algorithm, a screening and sorting algorithm, and a picture quality evaluation algorithm, and step S4 specifically comprises the following steps:
SA, comparison fusion: performing similarity comparison on the effective face pictures of every two groups of RGB pictures by a cyclic comparison method to obtain corresponding comparison scores;
SB, screening and sorting:
screening: comparing the plurality of comparison scores with a preset threshold respectively, and discarding comparison scores below the preset threshold;
sorting: sorting the remaining comparison scores by score, and retaining the effective face pictures of the two groups of RGB pictures corresponding to the highest comparison score;
SC, picture quality evaluation: performing picture quality evaluation on the effective face pictures of the two groups of RGB pictures, and marking the effective face pictures of the RGB pictures that pass the picture quality evaluation as effective data.
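The comparison fusion, screening and sorting of steps SA through SC can be sketched in Python as follows. This is an illustrative reading of the claim, not the patented implementation: the similarity function, the threshold value 0.75, the dictionary keyed by camera ID, and the quality predicate are all assumptions introduced here for the sketch.

```python
import itertools

SIM_THRESHOLD = 0.75  # assumed preset threshold; the patent does not fix a value

def cyclic_compare(valid_faces, similarity):
    """SA: compare the effective face of every pair of camera groups.
    `valid_faces` maps camera-group ID to that group's effective face picture."""
    scores = {}
    for (id_a, face_a), (id_b, face_b) in itertools.combinations(valid_faces.items(), 2):
        scores[(id_a, id_b)] = similarity(face_a, face_b)
    return scores

def screen_and_sort(scores, threshold=SIM_THRESHOLD):
    """SB: discard pairs scoring below the threshold, keep the highest pair.
    Returns None when no pair survives (claim 5: retry at time T + 1)."""
    kept = {pair: s for pair, s in scores.items() if s >= threshold}
    if not kept:
        return None
    return max(kept, key=kept.get)

def select_effective_pair(valid_faces, similarity, quality_ok):
    """SA to SC combined: faces of the best pair that pass quality evaluation."""
    best = screen_and_sort(cyclic_compare(valid_faces, similarity))
    if best is None:
        return []
    return [valid_faces[cam_id] for cam_id in best if quality_ok(valid_faces[cam_id])]
```

For example, with three camera groups whose faces are represented as feature vectors and a toy similarity measure, the two most mutually similar groups are retained and the outlier discarded.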
2. The multi-camera-based face fusion comparison method according to claim 1, wherein the picture screening algorithm in step S4 may be at least one of the following algorithms:
(1) a comparison fusion algorithm: performing similarity comparison on the effective face pictures of every two groups of RGB pictures by a cyclic comparison method to obtain corresponding comparison scores;
marking the effective face pictures of the two groups of RGB pictures with the highest comparison score as effective data;
(2) a screening and sorting algorithm:
similarity value: obtaining the similarity value of the effective face pictures of every two groups of RGB pictures;
screening: comparing the plurality of similarity values with a preset threshold respectively, and discarding similarity values below the preset threshold;
sorting: sorting the remaining similarity values by value, and retaining the effective face pictures of the two groups of RGB pictures corresponding to the highest similarity value;
(3) a picture quality evaluation algorithm: performing picture quality evaluation on the effective face pictures of each group of RGB pictures, and marking the effective face pictures of the RGB pictures that pass the picture quality evaluation as effective data.
3. The method according to claim 1, wherein if the plurality of groups of cameras include one or more groups of binocular cameras, each binocular camera consisting of an RGB camera and an IR camera, step S2 further comprises a living body recognition step:
S001: identifying the video stream data of the binocular cameras, and retrieving the RGB picture at time T corresponding to that group of cameras;
S002: detecting the RGB picture, and recording the position of the face in the RGB picture when a face is detected;
S003: performing face detection at the corresponding position in the IR picture according to the position of the face in the RGB picture;
S004: if a face is detected, judging that the face in the current RGB picture is in a living state, and retaining the RGB picture in the living state;
S005: if no face is detected, judging that the face in the current RGB picture is in a non-living state, and discarding the RGB picture in the non-living state.
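Steps S001 to S005 amount to a cross-spectral consistency check: a printed photo or a screen replay that fools the RGB detector generally leaves no usable face in the IR channel. A minimal sketch, assuming a `detect_faces(frame)` detector (not specified by the patent) that returns (x, y, w, h) boxes, and using box overlap to match the RGB position against the IR picture:

```python
def _overlaps(a, b, iou_min=0.3):
    """Intersection-over-union test between two (x, y, w, h) boxes.
    The 0.3 threshold is an assumption; the claim only says 'corresponding position'."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return union > 0 and inter / union >= iou_min

def is_live_face(rgb_frame, ir_frame, detect_faces):
    """Claim-3 sketch: a face is judged live only if every face found in the
    RGB picture is also detected at the corresponding position in the IR picture."""
    rgb_boxes = detect_faces(rgb_frame)          # S002
    if not rgb_boxes:
        return False
    ir_boxes = detect_faces(ir_frame)            # S003
    for box in rgb_boxes:
        if not any(_overlaps(box, b) for b in ir_boxes):
            return False                         # S005: non-living, discard
    return True                                  # S004: living state, retain
```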
4. The multi-camera-based face fusion comparison method according to claim 1, wherein step S3 further comprises a step of judging the number of effective faces:
calculating the number of effective faces;
if the number of effective faces is greater than or equal to 2, proceeding to step SA;
if the number of effective faces equals 1, proceeding to step S5.
5. The multi-camera-based face fusion comparison method according to claim 1, further comprising, after step SB, a step of judging the number of remaining comparison scores:
calculating the number of remaining comparison scores;
if the number of remaining comparison scores is greater than or equal to 1, proceeding to step SC;
if the number of remaining comparison scores equals 0, setting the time T = T + 1 and returning to step S2.
6. The multi-camera-based face fusion comparison method according to claim 2, wherein the picture quality evaluation performs quality detection and judgment based on a plurality of indexes and their effective ranges, the plurality of indexes and effective ranges comprising:
index A: image blur degree, effective range (0.1, 1];
index B: face pitch angle, effective range [-20°, +20°];
index C: mask proportion of the face, with effective range (0.9, 1) for the mouth and nose C1 and effective range (0.5, 1) for the eyes C2;
index D: mouth opening and closing degree, effective range (0.1, 1);
wherein RGB pictures simultaneously satisfying the effective ranges of all four indexes (image blur degree A, face pitch angle B, face mask proportion C, and mouth opening and closing degree D) are marked as effective data.
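The hard-threshold variant of claim 6 is a simple conjunction of range checks. In this sketch the index values (blur, pitch, mask exposure ratios, mouth openness) are assumed to come from upstream estimators not described in the patent, and the half-open interval boundaries are simplified to closed intervals:

```python
# Effective ranges from claim 6; boundary handling simplified to closed intervals.
VALID_RANGES = {
    "blur":    (0.1, 1.0),      # index A, image blur degree
    "pitch":   (-20.0, 20.0),   # index B, face pitch angle in degrees
    "mask_c1": (0.9, 1.0),      # index C1, mouth/nose exposure ratio
    "mask_c2": (0.5, 1.0),      # index C2, eye exposure ratio
    "mouth":   (0.1, 1.0),      # index D, mouth opening/closing degree
}

def passes_quality(indexes):
    """Mark a picture as effective data only if every index lies in its range."""
    return all(lo <= indexes[k] <= hi for k, (lo, hi) in VALID_RANGES.items())
```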
7. The multi-camera-based face fusion comparison method according to claim 2, wherein the picture quality evaluation is based on a plurality of indexes, the plurality of indexes comprising:
index A: image blur degree;
index B: face pitch angle;
index C: mask proportion of the face, comprising mouth and nose C1 and eyes C2;
index D: mouth opening and closing degree;
wherein the quality evaluation score is obtained by assigning corresponding weights w1, w2, w3 and w4 to the image blur degree A, the face pitch angle B, the face mask proportion C, and the mouth opening and closing degree D, and substituting them into the formula:
score_i = w1*A + w2*B + w3*(C1 + C2) + w4*D.
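The weighted-sum variant of claim 7 maps directly to code. The weight values below are placeholders (the patent does not specify them), and in practice the pitch angle B would need normalizing to a comparable scale before weighting:

```python
def quality_score(A, B, C1, C2, D, w=(0.4, 0.2, 0.2, 0.2)):
    """score_i = w1*A + w2*B + w3*(C1 + C2) + w4*D  (claim 7).
    Weights are illustrative placeholders; all inputs assumed already normalized."""
    w1, w2, w3, w4 = w
    return w1 * A + w2 * B + w3 * (C1 + C2) + w4 * D
```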
8. The multi-camera-based face fusion comparison method according to claim 2, wherein in step S4 the similarity comparison of effective faces is performed by calculating the Euclidean distance or the cosine distance between the two groups of effective faces, an effective face being the face portion with the largest face occupation ratio in its RGB picture.
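The two measures named in claim 8 have standard definitions. A minimal sketch over plain feature vectors (the feature extraction step itself is not described in the patent):

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two face feature vectors; smaller is more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine similarity in [-1, 1]; 1.0 means identical direction.
    Cosine *distance* is commonly taken as 1 - cosine_similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```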
9. A face fusion comparison device based on multiple cameras, characterized by comprising:
a memory;
one or more processors; and
one or more modules stored in the memory and configured to be executed by the one or more processors, the one or more modules comprising:
a video data stream acquisition module, used for acquiring video stream data of a plurality of groups of cameras and storing the video stream data in groups according to camera IDs;
a picture extraction module, used for acquiring, at the same time T, a plurality of groups of RGB pictures respectively corresponding to the video stream data;
a face extraction module, used for extracting faces from each group of RGB pictures to obtain a plurality of face pictures corresponding to each RGB picture, calculating the face occupation ratio of each face picture, and determining the face picture with the largest face occupation ratio as the effective face picture of the corresponding RGB picture;
a picture screening module, used for obtaining, through a picture screening algorithm, the effective face pictures of the RGB pictures marked as effective data;
an uploading module, used for sending the effective face pictures of the RGB pictures marked as effective data to a face comparison server;
wherein the picture screening algorithm is composed, in sequence, of a comparison fusion algorithm, a screening and sorting algorithm, and a picture quality evaluation algorithm, and specifically comprises the following steps:
SA, comparison fusion: performing similarity comparison on the effective face pictures of every two groups of RGB pictures by a cyclic comparison method to obtain corresponding comparison scores;
SB, screening and sorting:
screening: comparing the plurality of comparison scores with a preset threshold respectively, and discarding comparison scores below the preset threshold;
sorting: sorting the remaining comparison scores by score, and retaining the effective face pictures of the two groups of RGB pictures corresponding to the highest comparison score;
SC, picture quality evaluation: performing picture quality evaluation on the effective face pictures of the two groups of RGB pictures, and marking the effective face pictures of the RGB pictures that pass the picture quality evaluation as effective data.
CN202111415185.9A 2021-11-25 2021-11-25 Face fusion comparison method and device based on multiple cameras Active CN114093004B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111415185.9A CN114093004B (en) 2021-11-25 2021-11-25 Face fusion comparison method and device based on multiple cameras


Publications (2)

Publication Number Publication Date
CN114093004A CN114093004A (en) 2022-02-25
CN114093004B (en) 2023-05-02

Family

ID=80304606


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229362A (en) * 2017-12-27 2018-06-29 杭州悉尔科技有限公司 A kind of binocular recognition of face biopsy method based on access control system
CN110363126A (en) * 2019-07-04 2019-10-22 杭州视洞科技有限公司 A kind of plurality of human faces real-time tracking and out of kilter method
CN111209845A (en) * 2020-01-03 2020-05-29 平安科技(深圳)有限公司 Face recognition method and device, computer equipment and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550671A (en) * 2016-01-28 2016-05-04 北京麦芯科技有限公司 Face recognition method and device
CN107862240B (en) * 2017-09-19 2021-10-08 中科(深圳)科技服务有限公司 Multi-camera collaborative face tracking method
CN108921041A (en) * 2018-06-06 2018-11-30 深圳神目信息技术有限公司 A kind of biopsy method and device based on RGB and IR binocular camera
CN109190532A (en) * 2018-08-21 2019-01-11 北京深瞐科技有限公司 It is a kind of based on cloud side fusion face identification method, apparatus and system
CN110008797B (en) * 2018-10-08 2021-12-14 杭州中威电子股份有限公司 Multi-camera multi-face video continuous acquisition method
CN110852219B (en) * 2019-10-30 2022-07-08 广州海格星航信息科技有限公司 Multi-pedestrian cross-camera online tracking system
CN111222433B (en) * 2019-12-30 2023-06-20 新大陆数字技术股份有限公司 Automatic face auditing method, system, equipment and readable storage medium
CN111783632B (en) * 2020-06-29 2022-06-10 北京字节跳动网络技术有限公司 Face detection method and device for video stream, electronic equipment and storage medium
CN112766239A (en) * 2021-03-15 2021-05-07 中国工商银行股份有限公司 Face recognition method, face recognition system, electronic equipment and storage medium




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant