CN114299596A - Smart city face recognition matching method and system and cloud platform - Google Patents

Smart city face recognition matching method and system and cloud platform

Info

Publication number
CN114299596A
CN114299596A (application CN202210223565.0A)
Authority
CN
China
Prior art keywords
feature
face
aerial
image frame
aerial image
Prior art date
Legal status
Granted
Application number
CN202210223565.0A
Other languages
Chinese (zh)
Other versions
CN114299596B (en)
Inventor
杨翰翔
赖晓俊
Current Assignee
Shenzhen Lianhe Intelligent Technology Co ltd
Original Assignee
Shenzhen Lianhe Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Lianhe Intelligent Technology Co ltd filed Critical Shenzhen Lianhe Intelligent Technology Co ltd
Priority to CN202210223565.0A priority Critical patent/CN114299596B/en
Publication of CN114299596A publication Critical patent/CN114299596A/en
Application granted granted Critical
Publication of CN114299596B publication Critical patent/CN114299596B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the invention provides a smart city face recognition matching method, a smart city face recognition matching system and a cloud platform. An aerial surveillance image captured by an unmanned aerial vehicle is first matched against a pre-constructed face feature set to obtain a face key feature sequence, which is then matched against a target face database. This can significantly improve face recognition matching efficiency and thereby enhance the application effect of UAV aerial surveillance.

Description

Smart city face recognition matching method and system and cloud platform
Technical Field
The invention relates to the technical field of smart city monitoring and unmanned aerial vehicles, in particular to a face recognition matching method and system for a smart city and a cloud platform.
Background
Unmanned Aerial Vehicles (UAVs) are also known as drones. With the rapid development of unmanned flight technology, consumer UAVs are widely used across industries to perform work in place of human operators.
Further, as the construction of smart cities accelerates, UAV applications in the smart city field (such as smart city management) have become widespread. For example, UAVs are used for smart city environmental monitoring and command, automated food delivery, smart city logistics, and other fields, greatly facilitating people's daily work and life while making cities increasingly "intelligent".
In addition, crowd monitoring is an important UAV application in smart city management. On certain specific occasions, a UAV performs aerial surveillance of the crowd present, and the resulting aerial surveillance images are analyzed to monitor that occasion. For example, UAVs can perform aerial surveillance of train stations, airports, and urban squares with dense pedestrian flows. As another example, in a special aerial surveillance scenario, a target monitoring object in the monitored crowd can be identified by performing facial feature recognition on the images captured by the UAV. However, because the number of aerial surveillance images is huge, current face recognition methods based on such images suffer from low recognition efficiency and poor application effect.
Disclosure of Invention
In order to solve the above problem, in a first aspect, an embodiment of the present invention provides a face recognition matching method for a smart city, which is applied to a cloud platform of a face recognition matching system, where the face recognition matching system further includes an unmanned aerial vehicle that is in communication connection with the cloud platform and is used for respectively performing aerial photography monitoring on target monitoring occasions, and the method includes:
acquiring an aerial photography monitoring image obtained by aerial photography of a preset monitored urban area by the unmanned aerial vehicle, wherein the aerial photography monitoring image comprises a plurality of aerial photography image frames, and acquiring a pre-constructed human face feature set and a target human face database;
carrying out face recognition matching on the aerial photography monitoring image and the face feature set to obtain a face key feature sequence comprising two or more pieces of face feature information;
and carrying out face recognition matching on the face key feature sequence and the target face database, and determining whether the aerial monitoring image comprises an aerial image frame matched with the face data in the target face database.
Based on a possible implementation manner of the first aspect, the performing face recognition matching on the aerial surveillance image and the face feature set to obtain a face key feature sequence including two or more pieces of face feature information includes:
matching and analyzing each aerial image frame in the aerial monitoring image with the face feature set respectively to obtain a matching result between each aerial image frame and the face feature set respectively;
according to the matching result corresponding to each aerial photography image frame and the face description of each aerial photography image frame, performing frame sequence adjustment on each aerial photography image frame, and obtaining an aerial photography image frame discrete queue according to each aerial photography image frame after frame sequence adjustment;
and obtaining a face key feature sequence related to the face feature set based on the aerial photography image frame discrete queue, wherein the face key feature sequence comprises two or more pieces of face feature information.
Based on a possible implementation manner of the first aspect, performing face recognition matching on the face key feature sequence and the target face database, and determining whether the aerial surveillance image includes an aerial image frame matched with the face data in the target face database includes:
selecting at least one piece of face feature information which is ranked at the front from the face key feature sequence as face feature information to be recognized according to the matching result of each piece of face feature information and the face feature set; carrying out face recognition matching on each piece of face feature information to be recognized and the target face database, and determining whether the aerial monitoring image comprises an aerial image frame matched with the face data in the target face database; or
And sequentially carrying out face recognition matching on the face feature information and the target face database according to the arrangement sequence of the face feature information in the face key feature sequence.
Based on a possible implementation manner of the first aspect, the performing frame sequence adjustment on each aerial image frame according to a matching result corresponding to each aerial image frame and a face description of each aerial image frame, and obtaining an aerial image frame discrete queue according to each aerial image frame after frame sequence adjustment includes:
classifying the aerial image frames according to the matching result corresponding to the aerial image frames and the face description of the aerial image frames to obtain two or more aerial image frame sets;
sequencing each aerial photography image frame set, and respectively carrying out frame sequence adjustment on each aerial photography image frame in each aerial photography image frame set to obtain the aerial photography image frame discrete queue;
the method for classifying the aerial image frames according to the matching result corresponding to each aerial image frame and the face description of each aerial image frame to obtain two or more aerial image frame sets comprises the following steps:
respectively performing feature fusion on the face description of each aerial image frame according to the matching result corresponding to each aerial image frame to obtain the key face feature description of each aerial image frame;
classifying the aerial image frames according to the key description of the human face characteristics of the aerial image frames to obtain two or more aerial image frame sets;
the sequence arrangement of each aerial image frame set and the frame sequence adjustment of each aerial image frame in each aerial image frame set are respectively carried out to obtain the aerial image frame discrete queue, and the method comprises the following steps:
according to the number of aerial image frames contained in each aerial image frame set, performing order arrangement on each aerial image frame set;
for each aerial image frame set, adjusting the frame sequence of each aerial image frame in the aerial image frame set according to the human face description of each aerial image frame in the aerial image frame set and the common quantitative index of the aerial image frame set;
and obtaining the aerial image frame discrete queue based on the frame set arrangement result among the aerial image frame sets and the frame sequence adjustment result of the aerial image frames in each aerial image frame set.
Based on a possible implementation manner of the first aspect, the performing matching analysis on each aerial image frame in the aerial surveillance image and the face feature set to obtain a matching result between each aerial image frame and the face feature set includes:
respectively inputting each aerial photography image frame into a human face key feature AI identification network obtained in advance, carrying out feature classification and identification on each aerial photography image frame based on a feature identification matching unit based on global feature dimension in the human face key feature AI identification network obtained in advance, carrying out feature matching on image classification features obtained by feature classification and identification and the human face feature set respectively, and then outputting a matching result corresponding to each aerial photography image frame;
the frame sequence adjustment is performed on each aerial image frame according to the matching result corresponding to each aerial image frame and the face description of each aerial image frame, and the aerial image frame discrete queue is obtained according to each aerial image frame after the frame sequence adjustment, and the method comprises the following steps:
respectively inputting each aerial image frame and a matching result corresponding to each aerial image frame into an aerial image frame sequence adjusting unit in the pre-obtained human face key feature AI identification network, carrying out frame sequence adjustment on each aerial image frame based on the aerial image frame sequence adjusting unit, obtaining a first fusion feature description of local feature dimensions output by the aerial image frame sequence adjusting unit, wherein each local aerial image feature in the first fusion feature description jointly forms the aerial image frame discrete queue;
the obtaining of the human face key feature sequence related to the human face feature set based on the aerial photography image frame discrete queue comprises:
inputting the fusion feature description into a face key feature recognition unit in the pre-obtained face key feature AI recognition network, and performing key feature classification recognition based on the face key feature recognition unit to obtain the face key feature sequence output by the face key feature recognition unit; the pre-obtained face key feature AI identification network is obtained by training according to a sampling image library, sampling images in the sampling image library comprise aerial sampling image frames of calibration matching results, and the calibration matching results are used for describing the matching condition of the aerial sampling image frames and a reference face feature set.
Based on a possible implementation manner of the first aspect, the respectively inputting each of the aerial image frames into a pre-obtained face key feature AI identification network, performing feature classification and identification on each of the aerial image frames based on a feature identification matching unit based on a global feature dimension in the pre-obtained face key feature AI identification network, performing feature matching on image classification features obtained by the feature classification and identification and the face feature set, and outputting matching results corresponding to each of the aerial image frames, includes:
inputting each aerial image frame into the feature identification matching unit respectively, and projecting each aerial image frame to a set image description space based on a global feature description subunit in the feature identification matching unit to obtain feature description of each aerial image frame;
respectively converting the feature description of each aerial image frame into corresponding visual feature description through a preset image feature extraction algorithm;
based on the feature identification matching unit, respectively extracting key features between the visual feature description of each aerial image frame and the visual feature descriptions of other aerial image frames;
and obtaining a matching result between each aerial image frame and the face feature set based on the key features corresponding to each aerial image frame.
Based on a possible implementation manner of the first aspect, the obtaining a first fused feature description of a local feature dimension output by the aerial image frame sequence adjusting unit by performing frame sequence adjustment on each aerial image frame based on the aerial image frame sequence adjusting unit includes:
projecting each aerial image frame to a set image description space based on the aerial image frame sequence adjusting unit in the pre-obtained face key feature AI identification network, to obtain a local feature description sequence corresponding to each aerial image frame;
performing visual feature sampling on the local feature description sequence corresponding to each aerial image frame through feature segmentation downsampling processing to obtain visual feature description of each aerial image frame;
respectively performing feature fusion on the visual feature description of each aerial image frame according to the matching result corresponding to each aerial image frame to obtain key visual feature description of each aerial image frame;
classifying based on the key visual feature description of each aerial photography image frame to obtain two or more aerial photography image frame sets;
arranging the aerial image frame sets in sequence, adjusting the frame sequence of the aerial image frames in each aerial image frame set, and performing feature fusion on key visual feature descriptions of the aerial image frames to obtain a first fusion feature description;
the step of inputting the fusion feature description into a face key feature recognition unit in the pre-obtained face key feature AI recognition network, performing key feature classification recognition based on the face key feature recognition unit, and obtaining the face key feature sequence output by the face key feature recognition unit includes:
traversing and accessing each piece of face feature information in the face key feature sequence, wherein one face key feature in the target face key feature sequence comprises at least one piece of face feature information;
for each traversal process, inputting the previously output face feature information into the face key feature recognition unit, wherein the face key feature recognition unit is firstly input with an initialization feature description configured in advance;
calculating matching relation quantization parameters of the previously output face feature information and each local aerial image feature, wherein the matching relation quantization parameters are used for describing the matching degree between the local aerial image feature and the previously output face feature information;
performing feature fusion on the matching relationship quantization parameter and a visual feature description sequence of local aerial image features in the aerial image frame discrete queue, and inputting the visual feature description sequence into a visual feature convolution model to obtain target visual feature description of the aerial image frame discrete queue output by current traversal operation;
and obtaining the face feature information output by the current traversal operation based on the previously output face feature information and the target visual feature description, and further obtaining a corresponding target face key feature sequence according to the face feature information output by each traversal operation.
Based on a possible implementation manner of the first aspect, the method further includes a step of performing network training on the AI recognition network, where the step includes:
acquiring a sampling image library aiming at reference face feature sets, and selecting a sampling image sequence which is used for each reference face feature set and has a plurality of sampling images from the sampling image library;
respectively inputting aerial image sampling image frames contained in each screened sampling image into a global feature dimension-based feature identification matching unit in the preset human face key feature AI identification network, and obtaining a matching result corresponding to each aerial image sampling image frame output by the feature identification matching unit;
obtaining a first model cost index parameter based on the similarity between the matching result corresponding to each aerial photography sampling image frame and the corresponding calibration matching result;
respectively inputting aerial photography sampling image frames in the screened sampling images and matching results corresponding to the aerial photography sampling image frames into an aerial photography image frame sequence adjusting unit in the preset human face key feature AI identification network, classifying the aerial photography sampling image frames based on the aerial photography image frame sequence adjusting unit, and obtaining two or more aerial photography image frame sets;
performing order arrangement on each aerial image frame set based on the aerial image frame order adjusting unit to obtain a second fusion feature description of the local feature dimension output by the aerial image frame order adjusting unit;
inputting the second fusion feature description into a face key feature recognition unit in the preset face key feature AI recognition network, and performing key feature classification recognition based on the face key feature recognition unit to obtain a presumed face key feature sequence output by the face key feature recognition unit, wherein the presumed face key feature sequence comprises two or more pieces of presumed face feature information;
obtaining a second model cost index parameter based on the description component offset of the presumed face feature information in the presumed face key feature sequence and the calibrated face feature information in the calibrated face key feature sequence, and obtaining a third model cost index parameter based on the focus feature quantization index of each local aerial image feature in each aerial image frame set;
and performing network iterative optimization on the preset human face key feature AI identification network according to the first model cost index parameter, the second model cost index parameter and the third model cost index parameter until a training convergence condition is met to obtain the trained human face key feature AI identification network.
In a second aspect, an embodiment of the present invention further provides a face recognition matching system for a smart city, where the face recognition matching system includes a cloud platform and a plurality of unmanned aerial vehicles which are in communication connection with the cloud platform and are used to respectively perform aerial photography monitoring on target monitoring occasions, and the cloud platform includes:
the system comprises an acquisition module, a target face database and a monitoring module, wherein the acquisition module is used for acquiring an aerial photography monitoring image obtained by aerial photography of a preset monitored urban area by the unmanned aerial vehicle, the aerial photography monitoring image comprises a plurality of aerial photography image frames, and a pre-constructed face feature set and the target face database are acquired;
the first identification matching module is used for carrying out face identification matching on the aerial photography monitoring image and the face feature set to obtain a face key feature sequence comprising two or more pieces of face feature information;
and the second identification matching module is used for carrying out face identification matching on the face key feature sequence and the target face database and determining whether the aerial monitoring image comprises an aerial image frame matched with the face data in the target face database.
In a third aspect, an embodiment of the present invention further provides a cloud platform, where the cloud platform is respectively in communication connection with a plurality of drones for respectively performing aerial photography monitoring on target monitoring occasions, and the cloud platform includes a processor and a machine-readable storage medium, where the machine-readable storage medium is connected to the processor, the machine-readable storage medium is used to store programs, instructions, or codes, and the processor is used to execute the programs, instructions, or codes in the machine-readable storage medium, so as to implement the method.
In summary, in the embodiment of the present invention, an aerial photography monitoring image obtained by aerial photography of a preset monitored urban area by the unmanned aerial vehicle, a pre-constructed face feature set and a target face database are obtained, face recognition matching is performed on the aerial photography monitoring image and the face feature set to obtain a face key feature sequence including two or more pieces of face feature information, then face recognition matching is performed on the face key feature sequence and the target face database, and it is determined whether the aerial photography monitoring image includes an aerial photography image frame matched with face data in the target face database. Therefore, after the face key feature sequence corresponding to the aerial photography monitoring image is obtained through the face feature set, face recognition matching is carried out on the face key feature sequence and a set target face database (such as a database with face features of criminal suspects), and compared with the traditional mode that face recognition matching is carried out on each frame of image and then the face recognition matching is carried out on the target face database, the recognition efficiency can be obviously improved, and the application effect of aerial photography monitoring of the unmanned aerial vehicle is further improved.
Secondly, in this embodiment, according to a set rule, one part of the face feature information in the face key feature sequence (for example, a set number of pieces whose matching degree reaches a preset matching degree) may be selected for face recognition matching with the target face database, while the other part (for example, the part whose matching degree does not reach the preset matching degree) is excluded from matching with the target face database. Alternatively, the face feature information ranked at the front (e.g., with a higher matching degree to the face feature set) can be preferentially matched against the target face database, so that aerial image frames with obvious or numerous face features are recognized first, allowing the person of interest to be found quickly in emergencies (e.g., an urgent search for a target person such as a lost child or elderly person).
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic flow chart of a face recognition and matching method for a smart city according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an application environment of the face recognition and matching system for smart cities according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating sub-steps included in step S200 in fig. 1.
Fig. 4 is a schematic structural diagram of a cloud platform for implementing the foregoing smart city face recognition matching method according to an embodiment of the present invention.
Fig. 5 is a functional block diagram of the face recognition device in fig. 4.
Detailed Description
Referring to fig. 1, fig. 1 is a schematic flow chart of a face recognition and matching method for a smart city according to an embodiment of the present invention. In the embodiment of the present invention, as shown in fig. 2, the method can be implemented by the face recognition matching system 100 for smart city. In this embodiment, the smart city face recognition matching system 100 may include a cloud platform 11 and a plurality of unmanned aerial vehicles 12 communicatively connected to the cloud platform 11. In this embodiment, the cloud platform 11 is configured to manage and schedule each of the unmanned aerial vehicles 12, and under the scheduling control of the cloud platform 11 the unmanned aerial vehicles 12 work individually or in cooperation to perform aerial surveillance of target monitoring occasions, for example, crowd aerial surveillance of specific occasions such as bus stops, train stations, airports, and city squares, so as to discover target monitoring objects (such as criminal suspects, or children or elderly people being urgently searched for). In this embodiment, the cloud platform 11 may be a service platform set up for a smart city that communicates remotely with the plurality of unmanned aerial vehicles 12 in a set monitoring area to remotely control and schedule them. The cloud platform 11 may be, but is not limited to, a server with communication control capability and big data analysis capability, a cloud service center, a machine room control center, or a similar facility. The cloud platform 11 may also be a computer device disposed in a corresponding monitoring area for communicating with and controlling the drones 12.
The above-mentioned smart city face recognition matching method is described in detail below with reference to the accompanying drawings. In this embodiment, the method includes the following steps S100 to S300, which are described by way of example below.
Step S100, acquiring an aerial photography monitoring image obtained by aerial photography of a preset monitored urban area by the unmanned aerial vehicle, wherein the aerial photography monitoring image comprises a plurality of aerial photography image frames, and acquiring a pre-constructed human face feature set and a target human face database.
And step S200, carrying out face recognition matching on the aerial photography monitoring image and the face feature set to obtain a face key feature sequence comprising two or more pieces of face feature information.
And step S300, carrying out face recognition matching on the face key feature sequence and the target face database, and determining whether the aerial monitoring image comprises an aerial image frame matched with the face data in the target face database.
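For illustration only, steps S100 to S300 can be condensed into the following Python sketch. The function names, the representation of frames as feature sets, and the 0.8 coverage threshold are hypothetical stand-ins, not the patent's actual implementation:

```python
from typing import Dict, List, Set

def match_against_feature_set(frames: List[Dict], feature_set: Set[str]) -> List[Dict]:
    """S200 (simplified): score each aerial image frame by how many features
    of the pre-constructed face feature set it contains."""
    scored = [{"frame_id": f["id"],
               "features": f["features"] & feature_set,
               "score": len(f["features"] & feature_set)}
              for f in frames]
    # Entries ordered by matching result form the face key feature sequence.
    return sorted(scored, key=lambda e: e["score"], reverse=True)

def match_against_database(key_sequence: List[Dict],
                           target_db: List[Set[str]],
                           threshold: float = 0.8) -> List[int]:
    """S300 (simplified): a frame matches if its retained features cover a
    database entry (e.g. a suspect's face features) above a threshold."""
    matched_frames = []
    for entry in key_sequence:
        for person in target_db:
            if person and len(entry["features"] & person) / len(person) >= threshold:
                matched_frames.append(entry["frame_id"])
                break
    return matched_frames
```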
In the above embodiment, the face feature set may be pre-acquired or pre-constructed to include a plurality of face-related features, expressed by various types of facial attributes, for performing preliminary face recognition matching on the aerial image frames in the aerial surveillance image so as to determine the face feature correlation of each aerial image frame. For example, a face-related feature may be any feature related to the hair style, skin tone, eyes, head, face, or eyebrows of a human face. The target face database may contain preset target face data that needs to be monitored, such as a face database of criminal suspects, for performing accurate face matching analysis on the aerial image frames. As another example, when a target person (such as a lost child or elderly person) needs to be tracked in a specific occasion (such as a station), the relevant face data of the target person may be added to the target face database for accurate face recognition matching analysis. It should be understood that the above is only a simple exemplary illustration; in practical implementation, the face feature set and the target face database may also include other contents or exist in other representations, and this embodiment is not specifically limited in this respect.
In summary, in this embodiment, an aerial photography monitoring image obtained by aerial photography of a preset monitored urban area by the unmanned aerial vehicle, a pre-constructed face feature set and a target face database are obtained, face recognition matching is performed on the aerial photography monitoring image and the face feature set to obtain a face key feature sequence including two or more pieces of face feature information, then face recognition matching is performed on the face key feature sequence and the target face database, and it is determined whether the aerial photography monitoring image includes an aerial photography image frame matched with face data in the target face database. Therefore, after the face key feature sequence corresponding to the aerial photography monitoring image is obtained through the face feature set, face recognition matching is carried out on the face key feature sequence and a set target face database (such as a database with face features of criminal suspects), and compared with the traditional mode that face recognition matching is carried out on each frame of image and then the face recognition matching is carried out on the target face database, the recognition efficiency can be obviously improved, and the application effect of aerial photography monitoring of the unmanned aerial vehicle is further improved.
Further, in this embodiment, the face feature information in the face key feature sequence is arranged in order according to the matching result (including, for example, feature matching degree) between each face key feature and the face feature set. Based on this, for step S300, in a possible implementation, performing face recognition matching on the face key feature sequence and the target face database, and determining whether the aerial surveillance image includes an aerial image frame matched with the face data in the target face database may include the following contents described in steps S301 and S302.
Step S301, selecting at least one piece of face feature information with the top ranking from the face key feature sequence as the face feature information to be recognized according to the matching result of each piece of face feature information and the face feature set.
Step S302, carrying out face recognition matching on each piece of face feature information to be recognized and the target face database, and determining whether the aerial monitoring image comprises an aerial image frame matched with the face data in the target face database.
Therefore, according to the method, only a part of the face feature information in the face key feature sequence (such as a set number of pieces whose matching degree reaches the preset matching degree) is selected for face recognition matching with the target face database, and the other part (such as the part whose matching degree does not reach the preset matching degree) is excluded from matching. Compared with performing face recognition analysis and matching on every image frame, this reduces the recognition computation the cloud platform spends on unpromising aerial surveillance images and improves recognition efficiency.
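A minimal sketch of this selection strategy (steps S301 and S302) follows, assuming each sequence entry carries a match_degree field; the field name, top_k, and the threshold value are hypothetical:

```python
def select_candidates(key_feature_sequence, top_k=5, min_match_degree=0.6):
    """S301/S302 selection (sketch): keep only the top-ranked feature entries
    whose matching degree reaches a preset threshold; the rest are never
    compared against the target face database, saving computation."""
    ranked = sorted(key_feature_sequence,
                    key=lambda e: e["match_degree"], reverse=True)
    return [e for e in ranked[:top_k] if e["match_degree"] >= min_match_degree]
```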
In another possible embodiment, for step S300, performing face recognition matching on the face key feature sequence and the target face database, and determining whether the aerial surveillance image includes an aerial image frame matched with the face data in the target face database, may also be:
and sequentially carrying out face recognition matching on the face feature information and the target face database according to the arrangement sequence of the face feature information in the face key feature sequence.
Therefore, the face feature information ranked at the front (i.e., with a higher matching degree to the face feature set) is preferentially matched against the target face database, so that aerial image frames with more obvious or more numerous face features are recognized first, and the person of interest can be found quickly in emergencies (such as an urgent search for a target person, e.g., a lost child or elderly person).
Based on the above, in the embodiment of the present invention, the matching result may include a fused matching degree quantization index obtained by matching each aerial image frame against the plurality of facial features included in the facial feature set; for example, the more features of the face feature set a certain aerial image frame contains, the higher its matching degree. Of course, the matching degree may be calculated in other ways; for example, the quantization index may be determined from the ratio of pixels in each aerial image frame that match the corresponding facial features, which is not specifically limited in this embodiment.
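Both quantization schemes mentioned above might be expressed as follows; these formulas are one illustrative reading of the text, not a normative definition:

```python
def match_degree_by_feature_count(frame_features: set, feature_set: set) -> float:
    """The more face features of the feature set a frame contains,
    the higher its matching degree."""
    return len(frame_features & feature_set) / max(len(feature_set), 1)

def match_degree_by_pixel_ratio(matched_pixels: int, total_pixels: int) -> float:
    """Alternative: proportion of pixels in the frame that match face features."""
    return matched_pixels / max(total_pixels, 1)
```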
Further, in a possible implementation manner, as for the step S200, as shown in fig. 3, the performing face recognition matching on the aerial surveillance image and the face feature set to obtain a key feature sequence of a face including two or more pieces of face feature information may include the following steps S201 to S203.
Step S201, performing matching analysis on each aerial image frame in the aerial monitoring image and the face feature set respectively to obtain a matching result between each aerial image frame and the face feature set respectively.
Step S202, according to the matching result corresponding to each aerial image frame and the face description of each aerial image frame, performing frame sequence adjustment on each aerial image frame, and obtaining an aerial image frame discrete queue according to each aerial image frame after frame sequence adjustment.
In this embodiment, the aerial image frame discrete queue may include a plurality of aerial image frames of the aerial surveillance image, arranged in a discrete manner to form the queue.
For example, in an alternative implementation manner, in step S202, each of the aerial image frames may be classified according to the matching result corresponding to each of the aerial image frames and the face description of each of the aerial image frames, so as to obtain two or more aerial image frame sets. For example, for different aerial image frames, corresponding main human face features may be different, and corresponding human face descriptions are different, for example, the main features of the different image frames may be respectively concentrated at different positions such as a hairstyle, a mouth, eyes, a face, and the like, and then the plurality of aerial image frames may be divided into a plurality of different aerial image frame sets according to the information. For example, feature fusion can be performed on the face description of each aerial image frame according to the matching result corresponding to each aerial image frame, so as to obtain the key description of the face feature of each aerial image frame; then, classifying the aerial image frames according to the key description of the human face features of the aerial image frames to obtain two or more aerial image frame sets.
Then, the aerial image frame sets can be ordered, and the aerial image frame sets can be adjusted in frame sequence to obtain the aerial image frame discrete queue.
For example, in an alternative embodiment, the aerial image frame sets may be sorted according to the number of aerial image frames each set contains; for instance, a set containing more aerial image frames may be placed earlier in the order so that it is prioritized.
Secondly, for each aerial image frame set, adjusting the frame sequence of each aerial image frame in the aerial image frame set according to the human face description of each aerial image frame in the aerial image frame set and the common quantitative index of the aerial image frame set. The common quantization index may be, but is not limited to, a quantization index that a human face included in each aerial image frame describes a common property of a main feature (such as an eye feature, a skin color feature, a hair style feature, and the like) corresponding to the corresponding aerial image frame set, and for example, it may include a matching degree, a similarity degree, a feature distance, and the like, but is not limited thereto.
Then, based on the frame set arrangement result among the aerial image frame sets and the frame sequence adjustment result of each aerial image frame in each aerial image frame set, the aerial image frame discrete queue is obtained.
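Putting the classification, set ordering, and in-set frame ordering of step S202 together, a sketch might look as follows; the field names (dominant_region, commonality) and the grouping key are hypothetical stand-ins for the face description and the common quantitative index:

```python
from collections import defaultdict

def build_discrete_queue(frames):
    """Sketch of S202. 'frames' is a list of dicts, each with a
    'dominant_region' (e.g. 'eyes', 'mouth', 'hairstyle') derived from its
    face description and matching result, and a 'commonality' score against
    the set's shared main feature. All field names are hypothetical."""
    # Classify frames into two or more sets by key face feature description.
    frame_sets = defaultdict(list)
    for f in frames:
        frame_sets[f["dominant_region"]].append(f)

    # Order the sets: sets containing more frames come first (prioritized).
    ordered_sets = sorted(frame_sets.values(), key=len, reverse=True)

    # Within each set, adjust frame order by the common quantitative index.
    queue = []
    for s in ordered_sets:
        queue.extend(sorted(s, key=lambda f: f["commonality"], reverse=True))
    return queue  # the "aerial image frame discrete queue"
```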
Step S203, obtaining a human face key feature sequence related to the human face feature set based on the aerial photography image frame discrete queue, wherein the human face key feature sequence comprises two or more pieces of human face feature information.
In this embodiment, the face feature information included in the face key feature sequence may be feature information corresponding to preset different types of face key features (such as skin color features, eye features, nose features, hair style features, and the like), which may be determined specifically according to an actual situation, and is not specifically limited herein.
Further, for step S201, an Artificial Intelligence (AI) model may be used in one possible implementation to achieve the obtaining of the matching result. In detail, in an alternative embodiment, step S201 may include: and respectively inputting each aerial image frame into a human face key feature AI identification network obtained in advance, carrying out feature classification and identification on each aerial image frame based on a feature identification matching unit based on global feature dimension in the human face key feature AI identification network obtained in advance, carrying out feature matching on image classification features obtained by feature classification and identification and the human face feature set respectively, and outputting a matching result corresponding to each aerial image frame.
For example, each aerial image frame may be firstly input into the feature recognition matching unit, and each aerial image frame is projected to a set image description space based on a global feature description subunit in the feature recognition matching unit, so as to obtain a feature description of each aerial image frame; then, respectively converting the feature description of each aerial image frame into corresponding visual feature description through a preset image feature extraction algorithm; secondly, respectively extracting key features between the visual feature description of each aerial image frame and the visual feature descriptions of other aerial image frames based on the feature identification matching unit; and finally, obtaining a matching result between each aerial image frame and the face feature set based on the key features corresponding to each aerial image frame. The visual feature description may be, for example, but not limited to, a visual feature of a color, texture, contour, shape, surface, or the like.
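The description suggests an embedding-projection pipeline. The following PyTorch-style sketch is one possible reading; the layer sizes, the cross-frame similarity aggregation, and all names are assumptions rather than the patent's actual network:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureRecognitionMatchingUnit(nn.Module):
    """Sketch: project each frame into a set image description space, derive
    a visual feature description, relate each frame to the other frames, and
    score the result against the face feature set."""

    def __init__(self, in_dim: int = 2048, embed_dim: int = 256):
        super().__init__()
        # Global feature description subunit: projection to the description space.
        self.global_proj = nn.Linear(in_dim, embed_dim)
        # Stand-in for the "preset image feature extraction algorithm".
        self.visual_head = nn.Linear(embed_dim, embed_dim)

    def forward(self, frame_feats: torch.Tensor, face_feature_set: torch.Tensor):
        # frame_feats: (num_frames, in_dim); face_feature_set: (num_feats, embed_dim)
        desc = self.global_proj(frame_feats)         # feature description per frame
        visual = torch.tanh(self.visual_head(desc))  # visual feature description
        # Key features between each frame and the OTHER frames: a simple
        # cross-frame attention aggregation with the diagonal masked out.
        norm = F.normalize(visual, dim=-1)
        sim = norm @ norm.T
        sim = sim.masked_fill(torch.eye(sim.size(0), dtype=torch.bool), float("-inf"))
        key = sim.softmax(dim=-1) @ visual
        # Matching result: similarity of each frame's key features to the set.
        return F.normalize(key, dim=-1) @ F.normalize(face_feature_set, dim=-1).T
```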
On the basis of the above, in step S202, the frame sequence adjustment is performed on each aerial image frame according to the matching result corresponding to each aerial image frame and the face description of each aerial image frame, and an aerial image frame discrete queue is obtained according to each aerial image frame after the frame sequence adjustment, which can also be implemented in the following manner:
and respectively inputting each aerial image frame and a matching result corresponding to each aerial image frame into an aerial image frame sequence adjusting unit in the pre-obtained human face key feature AI identification network, carrying out frame sequence adjustment on each aerial image frame based on the aerial image frame sequence adjusting unit, obtaining a first fusion feature description of local feature dimensions output by the aerial image frame sequence adjusting unit, wherein each local aerial image feature in the first fusion feature description jointly forms the aerial image frame discrete queue.
In an alternative embodiment, each aerial image frame may first be projected to a set image description space based on the aerial image frame sequence adjusting unit in the pre-obtained face key feature AI identification network, to obtain a local feature description sequence corresponding to each aerial image frame;
secondly, performing visual feature sampling on a local feature description sequence corresponding to each aerial image frame through feature segmentation downsampling processing to obtain visual feature description of each aerial image frame; the characteristic segmented down-sampling process can be, for example, performing segmented or sub-segmented characteristic down-sampling on each aerial image frame, and performing visual characteristic sampling on a corresponding local characteristic description sequence; alternatively, the feature segmented downsampling may be, for example, a hierarchical pooling process, and is not limited in particular;
secondly, respectively performing feature fusion on the visual feature description of each aerial image frame according to the matching result corresponding to each aerial image frame to obtain key visual feature description of each aerial image frame;
classifying based on the key visual feature description of each aerial image frame to obtain two or more aerial image frame sets;
and finally, sequentially sorting the aerial photography image frame sets, adjusting the frame sequence of the aerial photography image frames in each aerial photography image frame set, and performing feature fusion on the key visual feature description of each aerial photography image frame to obtain the first fusion feature description.
It is to be understood that the term "global feature" as used above may refer to a feature that evaluates or expresses the facial features of an aerial image frame as a whole, starting from the global or overall facial features of the frame. A "local feature" may refer to a feature that evaluates or expresses a local aspect of the facial features of an aerial image frame (such as skin color, hair style, or contour), starting from a local or partial facial feature of the frame.
Based on this, in step S203, the obtaining a face key feature sequence related to the face feature set based on the aerial image frame discrete queue may further include:
inputting the fusion feature description into a face key feature recognition unit in the pre-obtained face key feature AI recognition network, and performing key feature classification recognition based on the face key feature recognition unit to obtain the face key feature sequence output by the face key feature recognition unit. The pre-obtained face key feature AI identification network is obtained by training according to a sampling image library, sampling images in the sampling image library comprise aerial sampling image frames of calibration matching results, and the calibration matching results are used for describing the matching condition of the aerial sampling image frames and a reference face feature set.
On the basis of the above, the above-mentioned inputting the fusion feature description into the face key feature recognition unit in the pre-obtained face key feature AI recognition network, performing key feature classification recognition based on the face key feature recognition unit, and obtaining the face key feature sequence output by the face key feature recognition unit can be implemented in the following manner.
Firstly, traversing and accessing each piece of face feature information in the face key feature sequence, wherein one face key feature in the target face key feature sequence comprises at least one piece of face feature information.
Secondly, inputting the face feature information which is output in advance into the face key feature recognition unit aiming at each traversal process, wherein the initialization feature description which is configured in advance is input into the face key feature recognition unit for the first time.
And thirdly, calculating matching relation quantization parameters of the previously output face feature information and the local aerial image features, wherein the matching relation quantization parameters are used for describing the matching degree between the local aerial image features and the previously output face feature information.
In this embodiment, for example, before the matching relationship quantization parameter between the previously output face feature information and each of the local aerial image features is calculated, the target aerial image frame set and the continuous frame set of the target aerial image frame set, which are screened by the current traversal operation, may be used as the selected aerial image frame set, and the remaining aerial image frame sets may be used as the candidate aerial image frame sets. And the target aerial image frame sets screened each time are sequentially processed according to the sequence of each aerial image frame set.
Then, implanting first matching information for local aerial image features in the selected aerial image frame set in the aerial image frame discrete queue, and implanting second matching information for local aerial image features in the candidate aerial image frame set in the aerial image frame discrete queue to obtain first visual feature matching description information corresponding to each local aerial image feature; and implanting the first matching information into the previously output face feature information to obtain corresponding second visual feature matching description information.
Based on this, the calculating a quantitative parameter of a matching relationship between the previously output face feature information and each of the local aerial image features may include:
and describing second visual feature matching description information corresponding to the previously output face feature information by combining first matching information corresponding to each local aerial image feature, and calculating to obtain a matching relation quantization parameter of the previously output face feature information and each local aerial image feature based on a preset matching degree quantization function.
And fourthly, performing feature fusion on the matching relationship quantization parameter and a visual feature description sequence of the local aerial image features in the aerial image frame discrete queue, inputting the feature fusion sequence into a visual feature convolution model, and obtaining the target visual feature description of the aerial image frame discrete queue output by the current traversal operation. The visual feature convolution model may be an AI model that is pre-constructed or trained and is used for performing visual feature convolution or extraction processing on the aerial image frame, which is not limited in this embodiment.
And fifthly, acquiring the face feature information output by the current traversal operation based on the previously output face feature information and the target visual feature description.
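The traversal in steps one to five resembles a recurrent, attention-style decoding over the local aerial image features of the discrete queue. Below is a compact sketch under that reading, with a scaled dot product standing in for the matching degree quantization function and a weighted sum standing in for the visual feature convolution model; both substitutions are assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def decode_key_features(local_feats, init_desc, steps):
    """Sketch: local_feats is (num_local, dim), the local aerial image
    features of the frame discrete queue; init_desc is the pre-configured
    initialization feature description fed on the first traversal."""
    prev = init_desc
    outputs = []
    for _ in range(steps):
        # Matching relation quantization parameters: degree of match between
        # each local feature and the previously output face feature information.
        q = softmax(local_feats @ prev / np.sqrt(len(prev)))
        # Fuse the parameters with the visual feature descriptions to get the
        # target visual feature description for this traversal.
        target_visual = q @ local_feats
        # Current output combines the previous output with the target description.
        prev = 0.5 * (prev + target_visual)
        outputs.append(prev)
    return outputs  # the target face key feature sequence
```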
Further, the embodiment of the present invention further provides an innovative method for network training for the AI recognition network of key features of human faces, which includes the following training steps.
(1) And acquiring a sampling image library aiming at the reference face feature set, and selecting a sampling image sequence which is used for each reference face feature set and has a plurality of sampling images from the sampling image library.
(2) And respectively inputting aerial image sampling image frames contained in each screened sampling image into a global feature dimension-based feature identification matching unit in the preset human face key feature AI identification network, and obtaining a matching result corresponding to each aerial image sampling image frame output by the feature identification matching unit.
(3) And obtaining a first model cost index parameter based on the similarity between the matching result corresponding to each aerial photography sampling image frame and the corresponding calibration matching result. The first model cost index parameter may be used to characterize a difference between the matching result and the calibrated matching result, where the smaller the index parameter is, the smaller the difference is, and the closer the performance of the model is to the required target performance.
(4) And respectively inputting the aerial photography sampling image frames in the screened sampling images and the matching results corresponding to the aerial photography sampling image frames into an aerial photography image frame sequence adjusting unit in the preset human face key feature AI identification network, and classifying the aerial photography sampling image frames based on the aerial photography image frame sequence adjusting unit to obtain two or more aerial photography image frame sets.
(5) And performing order arrangement on each aerial image frame set based on the aerial image frame order adjusting unit to obtain a second fusion feature description of the local feature dimension output by the aerial image frame order adjusting unit.
(6) Inputting the second fusion feature description into a face key feature recognition unit in the preset face key feature AI recognition network, and performing key feature classification recognition based on the face key feature recognition unit to obtain a presumed face key feature sequence output by the face key feature recognition unit, wherein the presumed face key feature sequence comprises two or more pieces of presumed face feature information.
(7) And obtaining a second model cost index parameter based on the description component offset of the presumed human face characteristic information in the presumed human face key characteristic sequence and the calibrated human face characteristic information in the calibrated human face key characteristic sequence, and obtaining a third model cost index parameter based on the focus characteristic quantization index of each local aerial image characteristic in each aerial image frame set. The focus feature quantization index of the local aerial image feature may be obtained by analyzing a plurality of preselected focus features (such as hair style features, eye features, nose features, mouth features, and the like), and may, for example, represent relevant information (such as a feature number, a corresponding pixel number ratio, and the like) of the focus features included in the local aerial image feature. In this embodiment, the focus feature quantization index may be obtained based on a preset focus feature focusing mechanism (which may be one of the attention mechanisms, such as a multi-head attention mechanism).
(8) And performing network iterative optimization on the preset human face key feature AI identification network according to the first model cost index parameter, the second model cost index parameter and the third model cost index parameter until a training convergence condition is met to obtain the trained human face key feature AI identification network.
For example, in this embodiment, the model cost index parameters may be subjected to index fusion according to preset parameter weights respectively for the first model cost index parameter, the second model cost index parameter, and the third model cost index parameter, and then the face key feature AI recognition network is subjected to network iterative optimization and network parameter adjustment according to the fused index parameters, so as to obtain a trained face key feature AI recognition network.
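As a sketch of this fused optimization objective, with hypothetical parameter weights and a hypothetical compute_costs API on the network object:

```python
import torch

def fused_cost(cost1, cost2, cost3, w=(1.0, 1.0, 0.5)):
    """Index-fuse the three model cost index parameters by preset weights."""
    return w[0] * cost1 + w[1] * cost2 + w[2] * cost3

def train_step(network, optimizer, batch):
    # compute_costs is a hypothetical method returning the three cost indices.
    cost1, cost2, cost3 = network.compute_costs(batch)
    loss = fused_cost(cost1, cost2, cost3)
    optimizer.zero_grad()
    loss.backward()   # network iterative optimization via gradient descent
    optimizer.step()
    return loss.item()
```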
As shown in fig. 4, an architectural schematic diagram of a cloud platform 11 provided in the embodiment of the present invention for implementing the foregoing method is provided. In this embodiment, the cloud platform 11 may include a face recognition device 110, a machine-readable storage medium 120, and a processor 130.
In this embodiment, the machine-readable storage medium and the processor may be located in the cloud platform 11 as separately provided components. The machine-readable storage medium 120 may also be independent of the cloud platform 11 and accessed by the processor 130. The face recognition apparatus 110 may include a plurality of functional modules stored in the machine-readable storage medium, for example, the various software functional modules included in the face recognition apparatus 110. When the processor executes the software functional modules in the face recognition apparatus 110, the smart city face recognition matching method provided by the foregoing method embodiment is implemented.
In this embodiment, the cloud platform 11 may include one or more processors. The processor may process information and/or data related to the service request to perform one or more of the functions described in this disclosure. In some embodiments, a processor may include one or more processing engines (e.g., a single-core processor or a multi-core processor). For example only, the processor may include one or more hardware processors such as one of a central processing unit CPU, an application specific integrated circuit ASIC, an application specific instruction set processor ASIP, a graphics processor GPU, a physical arithmetic processing unit PPU, a digital signal processor DSP, a field programmable gate array FPGA, a programmable logic device PLD, a controller, a microcontroller unit, a reduced instruction set computer RISC, a microprocessor, or the like, or any combination thereof.
The machine-readable storage medium may store data and/or instructions. In some embodiments, it may store acquired data or material, and it may store data and/or instructions that the cloud platform 11 executes or uses to implement the example methods described herein. In some embodiments, the machine-readable storage medium may include mass storage, removable storage, volatile read-write memory, read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage may include magnetic disks, optical disks, solid-state drives, and the like. Exemplary removable memory may include flash drives, floppy disks, optical disks, memory cards, compact disks, magnetic tape, and the like. Exemplary volatile read-write memory may include random-access memory (RAM). Exemplary RAM may include dynamic RAM (DRAM), double data rate synchronous dynamic RAM (DDR SDRAM), static RAM (SRAM), thyristor RAM (T-RAM), zero-capacitor RAM (Z-RAM), and the like. Exemplary ROM may include mask ROM (MROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), compact disc ROM (CD-ROM), digital versatile disc ROM, and the like.
The face recognition apparatus 110 included in the cloud platform 11 may include one or more software functional modules. The software functional modules may be stored as programs or instructions in the machine-readable storage medium; when executed by the corresponding processor (for example, a processor of the drone or of the cloud platform), they implement the method steps described above as performed by the drone or the cloud platform, respectively.
As shown in fig. 5, the face recognition apparatus 110 may include an obtaining module 1101, a first recognition matching module 1102, and a second recognition matching module 1103.
An obtaining module 1101, configured to obtain an aerial photography monitoring image obtained by aerial photography of a preset monitored urban area by the unmanned aerial vehicle, where the aerial photography monitoring image includes a plurality of aerial photography image frames, and obtain a pre-constructed face feature set and a target face database;
the first recognition matching module 1102 is configured to perform face recognition matching on the aerial photography monitoring image and the face feature set to obtain a face key feature sequence including two or more pieces of face feature information;
a second recognition matching module 1103, configured to perform face recognition matching between the face key feature sequence and the target face database, and determine whether the aerial photography monitoring image includes an aerial image frame matched with the face data in the target face database.
It should be noted that the obtaining module 1101, the first recognition matching module 1102, and the second recognition matching module 1103 may be configured to perform steps S100 to S300, respectively, as provided in the method embodiment of the present invention. For details of these modules, reference may be made to the detailed description of the corresponding method steps, which is not repeated here. A hypothetical wiring of the three modules follows.
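The sketch below shows one way the three modules might be composed as a pipeline; the class and callable signatures mirror S100 to S300 but are hypothetical stand-ins, not taken from the patent itself.

```python
# Hypothetical wiring of the three software functional modules as a pipeline;
# names and signatures are illustrative assumptions only.
class FaceRecognitionApparatus:
    def __init__(self, obtaining, first_matcher, second_matcher):
        self.obtaining = obtaining            # obtaining module 1101
        self.first_matcher = first_matcher    # first recognition matching module 1102
        self.second_matcher = second_matcher  # second recognition matching module 1103

    def run(self):
        frames, feature_set, target_db = self.obtaining()           # S100
        key_feature_seq = self.first_matcher(frames, feature_set)   # S200
        return self.second_matcher(key_feature_seq, target_db)      # S300

apparatus = FaceRecognitionApparatus(
    obtaining=lambda: (["frame_0", "frame_1"], {"feature_set"}, {"target_db"}),
    first_matcher=lambda frames, fs: [("face_info", 0.92)],
    second_matcher=lambda seq, db: any(score > 0.8 for _, score in seq),
)
print(apparatus.run())  # True: a matching aerial image frame was found
```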
In summary, in the embodiment of the present invention, an aerial photography monitoring image obtained by the unmanned aerial vehicle performing aerial photography on a preset monitored urban area is acquired together with a pre-constructed face feature set and a target face database; face recognition matching is performed between the aerial photography monitoring image and the face feature set to obtain a face key feature sequence including two or more pieces of face feature information; and face recognition matching is then performed between the face key feature sequence and the target face database to determine whether the aerial photography monitoring image includes an aerial image frame matched with the face data in the target face database. In this way, after the face key feature sequence corresponding to the aerial photography monitoring image is obtained through the face feature set, it is matched against a set target face database (for example, a database of the facial features of criminal suspects). Compared with the conventional approach of performing face recognition on every image frame and matching each frame against the target face database, this significantly improves recognition efficiency and thereby improves the application effect of aerial photography monitoring by unmanned aerial vehicles.
Further, in this embodiment, part of the face feature information in the face key feature sequence may be selected according to a set rule (for example, a set number of pieces whose matching degree reaches a preset matching degree) for face recognition matching against the target face database, while the remaining part (for example, the pieces whose matching degree does not reach the preset matching degree) is not matched against the target face database. Alternatively, the top-ranked face feature information (e.g., that with the highest matching degree against the face feature set) may be matched against the target face database first, so that aerial image frames with prominent or numerous face features are recognized preferentially; in some emergencies (e.g., an urgent search for a target person such as a child or an elderly person), the people to be identified can thus be found quickly. A hedged sketch of both selection strategies follows.
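In the sketch below, the 0.8 threshold and the (info, matching_degree) tuple shape are illustrative assumptions; the patent fixes neither.

```python
# A hedged sketch of the two selection strategies; threshold and data shape
# are assumptions introduced here for concreteness.
def select_for_matching(key_feature_sequence, threshold=0.8, top_k=None):
    """key_feature_sequence: list of (face_feature_info, matching_degree)."""
    # Strategy 1: discard entries below the preset matching degree.
    kept = [item for item in key_feature_sequence if item[1] >= threshold]
    # Strategy 2: rank the rest so prominent faces hit the target DB first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k] if top_k else kept

sequence = [("A", 0.95), ("B", 0.62), ("C", 0.88)]
print(select_for_matching(sequence))  # [('A', 0.95), ('C', 0.88)]
```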
While the invention has been described with reference to several embodiments, those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention as defined by the appended claims. The detailed description and the accompanying drawings are merely representative of selected embodiments and are not intended to limit the scope of the invention, which shall be subject to the scope of the claims. All other embodiments obtainable by a person skilled in the art from these embodiments without inventive effort shall also fall within the protection scope of the present invention.

Claims (10)

1. A smart city face recognition matching method, applied to a cloud platform of a face recognition matching system, characterized in that the face recognition matching system further comprises unmanned aerial vehicles which are communicatively connected to the cloud platform and are respectively used for carrying out aerial photography monitoring on target monitoring occasions, the method comprising:
acquiring an aerial photography monitoring image obtained by aerial photography of a preset monitored urban area by the unmanned aerial vehicle, wherein the aerial photography monitoring image comprises a plurality of aerial photography image frames, and acquiring a pre-constructed human face feature set and a target human face database;
carrying out face recognition matching on the aerial photography monitoring image and the face feature set to obtain a face key feature sequence comprising two or more pieces of face feature information;
and carrying out face recognition matching on the face key feature sequence and the target face database, and determining whether the aerial photography monitoring image comprises an aerial image frame matched with the face data in the target face database.
2. The method according to claim 1, wherein the performing face recognition matching on the aerial surveillance image and the face feature set to obtain a face key feature sequence including two or more pieces of face feature information comprises:
matching and analyzing each aerial image frame in the aerial monitoring image with the face feature set respectively to obtain a matching result between each aerial image frame and the face feature set respectively;
according to the matching result corresponding to each aerial photography image frame and the face description of each aerial photography image frame, performing frame sequence adjustment on each aerial photography image frame, and obtaining an aerial photography image frame discrete queue according to each aerial photography image frame after frame sequence adjustment;
and obtaining a face key feature sequence related to the face feature set based on the aerial photography image frame discrete queue, wherein the face key feature sequence comprises two or more pieces of face feature information.
3. The method according to claim 2, wherein the performing face recognition matching on the face key feature sequence and the target face database to determine whether the aerial photography monitoring image comprises an aerial image frame matched with the face data in the target face database comprises:
selecting, according to the matching result between each piece of face feature information and the face feature set, at least one top-ranked piece of face feature information from the face key feature sequence as face feature information to be recognized, carrying out face recognition matching on each piece of face feature information to be recognized and the target face database, and determining whether the aerial photography monitoring image comprises an aerial image frame matched with the face data in the target face database; or
sequentially carrying out face recognition matching on the face feature information and the target face database according to the arrangement order of the face feature information in the face key feature sequence.
4. The method as claimed in claim 2, wherein the performing a frame sequence adjustment on each of the aerial image frames according to the matching result corresponding to each of the aerial image frames and the face description of each of the aerial image frames, and obtaining the aerial image frame discrete queue according to each of the aerial image frames after the frame sequence adjustment comprises:
classifying the aerial image frames according to the matching result corresponding to the aerial image frames and the face description of the aerial image frames to obtain two or more aerial image frame sets;
sequencing each aerial photography image frame set, and respectively carrying out frame sequence adjustment on each aerial photography image frame in each aerial photography image frame set to obtain the aerial photography image frame discrete queue;
the method for classifying the aerial image frames according to the matching result corresponding to each aerial image frame and the face description of each aerial image frame to obtain two or more aerial image frame sets comprises the following steps:
respectively performing feature fusion on the face description of each aerial image frame according to the matching result corresponding to each aerial image frame to obtain the key face feature description of each aerial image frame;
classifying the aerial image frames according to the key face feature description of each aerial image frame to obtain two or more aerial image frame sets;
the sequence arrangement of each aerial image frame set and the frame sequence adjustment of each aerial image frame in each aerial image frame set are respectively carried out to obtain the aerial image frame discrete queue, and the method comprises the following steps:
according to the number of aerial image frames contained in each aerial image frame set, performing order arrangement on each aerial image frame set;
for each aerial image frame set, adjusting the frame sequence of each aerial image frame in the aerial image frame set according to the human face description of each aerial image frame in the aerial image frame set and the common quantitative index of the aerial image frame set;
and obtaining the aerial image frame discrete queue based on the frame set arrangement result among the aerial image frame sets and the frame sequence adjustment result of the aerial image frames in each aerial image frame set.
5. The method of claim 2, wherein the matching analysis of each aerial image frame in the aerial surveillance image with the facial feature set to obtain a matching result between each aerial image frame and the facial feature set comprises:
respectively inputting each aerial image frame into a pre-obtained human face key feature AI identification network, carrying out feature classification and identification on each aerial image frame based on a global-feature-dimension-based feature identification matching unit in the pre-obtained human face key feature AI identification network, carrying out feature matching between the image classification features obtained by the feature classification and identification and the face feature set, and then outputting a matching result corresponding to each aerial image frame;
the frame sequence adjustment is performed on each aerial image frame according to the matching result corresponding to each aerial image frame and the face description of each aerial image frame, and the aerial image frame discrete queue is obtained according to each aerial image frame after the frame sequence adjustment, and the method comprises the following steps:
respectively inputting each aerial image frame and the matching result corresponding to each aerial image frame into an aerial image frame sequence adjusting unit in the pre-obtained human face key feature AI identification network, carrying out frame sequence adjustment on each aerial image frame based on the aerial image frame sequence adjusting unit, and obtaining a first fusion feature description of local feature dimensions output by the aerial image frame sequence adjusting unit, wherein the local aerial image features in the first fusion feature description jointly form the aerial image frame discrete queue;
the obtaining of the human face key feature sequence related to the human face feature set based on the aerial photography image frame discrete queue comprises:
inputting the first fusion feature description into a face key feature recognition unit in the pre-obtained face key feature AI recognition network, and performing key feature classification and recognition based on the face key feature recognition unit to obtain the face key feature sequence output by the face key feature recognition unit; wherein the pre-obtained face key feature AI recognition network is obtained by training on a sampling image library, sampling images in the sampling image library comprise aerial photography sampling image frames with calibration matching results, and the calibration matching results are used for describing the matching condition between the aerial photography sampling image frames and a reference face feature set.
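Illustration only (not claim language): the sketch below shows one way the three units named in claim 5 could compose. The unit internals are placeholder callables; only the data flow of matching per frame, sequence adjustment into a fused description, and key feature recognition follows the claim.

```python
# Illustration only: composition of the three units of claim 5; internals are
# toy stand-ins, and only the data flow reflects the claim.
class FaceKeyFeatureAINetwork:
    def __init__(self, matcher, sequencer, recognizer):
        self.matcher = matcher        # global-feature-dimension matching unit
        self.sequencer = sequencer    # aerial image frame sequence adjusting unit
        self.recognizer = recognizer  # face key feature recognition unit

    def forward(self, frames, face_feature_set):
        matches = [self.matcher(f, face_feature_set) for f in frames]
        fused = self.sequencer(frames, matches)   # first fusion feature description
        return self.recognizer(fused)             # face key feature sequence

net = FaceKeyFeatureAINetwork(
    matcher=lambda f, s: len(f) % 2,                         # toy matching result
    sequencer=lambda fs, ms: sorted(zip(ms, fs), reverse=True),
    recognizer=lambda fused: [frame for _, frame in fused],
)
print(net.forward(["img_a", "img_bb"], {"ref"}))  # ['img_a', 'img_bb']
```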
6. The method according to claim 5, wherein the respectively inputting each of the aerial image frames into the pre-obtained human face key feature AI identification network, carrying out feature classification and identification on each of the aerial image frames based on the global-feature-dimension-based feature identification matching unit in the pre-obtained human face key feature AI identification network, and outputting the matching result corresponding to each of the aerial image frames after carrying out feature matching between the image classification features obtained by the feature classification and identification and the face feature set, comprises:
inputting each aerial image frame into the feature identification matching unit respectively, and projecting each aerial image frame to a set image description space based on a global feature description subunit in the feature identification matching unit to obtain feature description of each aerial image frame;
respectively converting the feature description of each aerial image frame into corresponding visual feature description through a preset image feature extraction algorithm;
based on the feature identification matching unit, respectively extracting key features between the visual feature description of each aerial image frame and the visual feature descriptions of other aerial image frames;
and obtaining a matching result between each aerial image frame and the face feature set based on the key features corresponding to each aerial image frame.
7. The method according to claim 5, wherein the carrying out frame sequence adjustment on each aerial image frame based on the aerial image frame sequence adjusting unit and obtaining the first fusion feature description of the local feature dimensions output by the aerial image frame sequence adjusting unit comprises:
projecting, by an aerial image frame sequence adjusting unit in the pre-obtained face key feature AI identification network, each aerial image frame to a set image description space to obtain a local feature description sequence corresponding to each aerial image frame;
performing visual feature sampling on the local feature description sequence corresponding to each aerial image frame through feature segmentation downsampling processing to obtain visual feature description of each aerial image frame;
respectively performing feature fusion on the visual feature description of each aerial image frame according to the matching result corresponding to each aerial image frame to obtain key visual feature description of each aerial image frame;
classifying based on the key visual feature description of each aerial photography image frame to obtain two or more aerial photography image frame sets;
arranging the aerial image frame sets in sequence, adjusting the frame sequence of the aerial image frames in each aerial image frame set, and performing feature fusion on key visual feature descriptions of the aerial image frames to obtain a first fusion feature description;
the step of inputting the fusion feature description into a face key feature recognition unit in the pre-obtained face key feature AI recognition network, performing key feature classification recognition based on the face key feature recognition unit, and obtaining the face key feature sequence output by the face key feature recognition unit includes:
traversing and accessing each piece of face feature information in the face key feature sequence, wherein one face key feature in the target face key feature sequence comprises at least one piece of face feature information;
for each traversal process, inputting the previously output face feature information into the face key feature recognition unit, wherein the face key feature recognition unit is firstly input with an initialization feature description configured in advance;
calculating matching relation quantization parameters of the previously output face feature information and each local aerial image feature, wherein the matching relation quantization parameters are used for describing the matching degree between the local aerial image feature and the previously output face feature information;
performing feature fusion on the matching relationship quantization parameter and a visual feature description sequence of local aerial image features in the aerial image frame discrete queue, and inputting the visual feature description sequence into a visual feature convolution model to obtain target visual feature description of the aerial image frame discrete queue output by current traversal operation;
and obtaining the face feature information output by the current traversal operation based on the previously output face feature information and the target visual feature description, and further obtaining a corresponding target face key feature sequence according to the face feature information output by each traversal operation.
8. The method according to claim 5, further comprising a step of performing network training on the face key feature AI recognition network, the step comprising:
acquiring a sampling image library for reference face feature sets, and selecting, from the sampling image library, a sampling image sequence having a plurality of sampling images for each reference face feature set;
respectively inputting the aerial photography sampling image frames contained in each selected sampling image into the global-feature-dimension-based feature identification matching unit in the preset human face key feature AI identification network, and obtaining the matching result corresponding to each aerial photography sampling image frame output by the feature identification matching unit;
obtaining a first model cost index parameter based on the similarity between the matching result corresponding to each aerial photography sampling image frame and the corresponding calibration matching result;
respectively inputting the aerial photography sampling image frames in the selected sampling images and the matching results corresponding to the aerial photography sampling image frames into an aerial photography image frame sequence adjusting unit in the preset human face key feature AI identification network, classifying the aerial photography sampling image frames based on the aerial photography image frame sequence adjusting unit, and obtaining two or more aerial photography image frame sets;
performing order arrangement on each aerial image frame set based on the aerial image frame order adjusting unit to obtain a second fusion feature description of the local feature dimension output by the aerial image frame order adjusting unit;
inputting the second fusion feature description into a face key feature recognition unit in the preset face key feature AI recognition network, and performing key feature classification recognition based on the face key feature recognition unit to obtain a presumed face key feature sequence output by the face key feature recognition unit, wherein the presumed face key feature sequence comprises two or more pieces of presumed face feature information;
obtaining a second model cost index parameter based on the description component offset between the presumed face feature information in the presumed face key feature sequence and the calibrated face feature information in the calibrated face key feature sequence, and obtaining a third model cost index parameter based on the focus feature quantization index of each local aerial image feature in each aerial image frame set;
and performing network iterative optimization on the preset human face key feature AI identification network according to the first model cost index parameter, the second model cost index parameter and the third model cost index parameter until a training convergence condition is met to obtain the trained human face key feature AI identification network.
9. A smart city face recognition matching system, characterized in that the face recognition matching system comprises a cloud platform and a plurality of unmanned aerial vehicles which are communicatively connected to the cloud platform and are respectively used for carrying out aerial photography monitoring on target monitoring occasions, the cloud platform comprising:
the system comprises an acquisition module, a target face database and a monitoring module, wherein the acquisition module is used for acquiring an aerial photography monitoring image obtained by aerial photography of a preset monitored urban area by the unmanned aerial vehicle, the aerial photography monitoring image comprises a plurality of aerial photography image frames, and a pre-constructed face feature set and the target face database are acquired;
a first recognition matching module, configured to carry out face recognition matching on the aerial photography monitoring image and the face feature set to obtain a face key feature sequence comprising two or more pieces of face feature information;
and a second recognition matching module, configured to carry out face recognition matching on the face key feature sequence and the target face database, and to determine whether the aerial photography monitoring image comprises an aerial image frame matched with the face data in the target face database.
10. A cloud platform, communicatively connected to a plurality of unmanned aerial vehicles respectively used for carrying out aerial photography monitoring on target monitoring occasions, the cloud platform comprising a processor and a machine-readable storage medium connected to the processor, wherein the machine-readable storage medium is configured to store programs, instructions, or code, and the processor is configured to execute the programs, instructions, or code in the machine-readable storage medium to implement the method of any one of claims 1 to 8.
CN202210223565.0A 2022-03-09 2022-03-09 Smart city face recognition matching method and system and cloud platform Active CN114299596B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210223565.0A CN114299596B (en) 2022-03-09 2022-03-09 Smart city face recognition matching method and system and cloud platform

Publications (2)

Publication Number Publication Date
CN114299596A true CN114299596A (en) 2022-04-08
CN114299596B CN114299596B (en) 2022-06-07

Family

ID=80978524

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210223565.0A Active CN114299596B (en) 2022-03-09 2022-03-09 Smart city face recognition matching method and system and cloud platform

Country Status (1)

Country Link
CN (1) CN114299596B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140072185A1 (en) * 2012-09-07 2014-03-13 David D. Dunlap Biometric Identification Systems and Methods
CN106778450A (en) * 2015-11-25 2017-05-31 腾讯科技(深圳)有限公司 A kind of face recognition method and device
CN109359579A (en) * 2018-10-10 2019-02-19 红云红河烟草(集团)有限责任公司 A kind of face identification system based on machine deep learning algorithm
CN110647813A (en) * 2019-08-21 2020-01-03 成都携恩科技有限公司 Human face real-time detection and identification method based on unmanned aerial vehicle aerial photography
CN111191637A (en) * 2020-02-26 2020-05-22 电子科技大学中山学院 Crowd concentration detection and presentation method based on unmanned aerial vehicle video acquisition

Also Published As

Publication number Publication date
CN114299596B (en) 2022-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant