CN111914658A - Pedestrian identification method, device, equipment and medium

Info

Publication number: CN111914658A
Application number: CN202010640777.XA
Authority: CN
Inventors: 张雷; 潘华东; 殷俊
Original assignee: Zhejiang Dahua Technology Co Ltd
Current assignee: Zhejiang Dahua Technology Co Ltd
Priority date: 2020-07-06
Filing date: 2020-07-06
Publication date: 2020-11-10
Anticipated expiration: 2040-07-06
Also published as: CN111914658B

Abstract

The application provides a pedestrian identification method, apparatus, device and medium. During pedestrian recognition, a first image to be detected and a second image of a target person are input into a pre-trained pedestrian recognition model; based on the pedestrian recognition model, first identification information indicating whether the first image and the second image are images of the same pedestrian is obtained; and when it is determined according to the first identification information that the first image and the second image are images of the same pedestrian, it is determined that an image of the target person has been detected. Because the first identification information is obtained from the pedestrian recognition model, whether the input images belong to the same pedestrian can be identified accurately without user participation, which improves the efficiency and accuracy of pedestrian recognition.

Description

Pedestrian identification method, device, equipment and medium

Technical Field

The present application relates to the field of image processing technologies, and in particular, to a method, an apparatus, a device, and a medium for pedestrian recognition.

Background

As more and more image acquisition devices come into use, they generate a huge number of images, and analysis relating to a target person is generally performed on an image database that stores these images. When an image of a target person is searched for in the image database, for example when pursuing a suspect or looking for a lost elderly person or child, a search-by-image method is generally employed. How to identify the image of the target person in the image database quickly and efficiently is a very challenging problem. At present, search by image for a target person generally relies on face recognition. However, in real scenes, because of complex environments, the low resolution of image acquisition devices and similar factors, it is difficult to obtain a clear face image, so face recognition cannot identify the image of the target person quickly and accurately.

In conventional image search, an image of the target person is given, and all images that satisfy the query condition are retrieved from the image database according to image content information or specified query criteria. The retrieved images are then sorted from high to low by their similarity to the target person and screened manually according to that similarity. To reduce the effort of manual screening, a similarity threshold can be preset manually, so that only images whose similarity exceeds the threshold are displayed and screened.

However, every such image search method requires user participation. In addition, because the images of the same pedestrian in the image database may be front, side or back views, recognition accuracy suffers when the search is based on face recognition.

Disclosure of Invention

The application provides a pedestrian identification method, apparatus, device and medium, which are used to solve the problems that user intervention is required and recognition accuracy is low when pedestrian identification is performed by search by image.

The embodiment of the application provides a pedestrian identification method, which comprises the following steps:

acquiring a first image to be detected and a second image of a target person;

determining, based on a pre-trained pedestrian recognition model and according to the first image and the second image, first identification information indicating whether the first image and the second image are images of the same pedestrian;

and if the first image and the second image are determined to belong to the same pedestrian according to the first identification information, determining that the image of the target person is detected.

In a possible implementation manner, the determining, based on the pre-trained pedestrian recognition model and according to the first image and the second image, of the first identification information indicating whether the first image and the second image are images of the same pedestrian includes:

determining, based on the pre-trained pedestrian recognition model, the first identification information indicating whether the first image and the second image are images of the same pedestrian, and determining at least one of a posture identification of the pedestrian contained in each of the first image and the second image and an ID of the pedestrian contained in each of the first image and the second image.

In one possible embodiment, the pedestrian recognition model is trained by:

acquiring at least two sample images in a sample set, wherein each sample image corresponds to a first ID of a pedestrian contained in the sample image, a first posture identification of the pedestrian contained in each sample image, and second identification information of whether each two sample images belong to the image of the same pedestrian;

acquiring a second ID of the pedestrian contained in each sample image, a second posture identification of the pedestrian contained in each sample image and third identification information of whether each two sample images belong to the image of the same pedestrian or not according to the at least two sample images through an original pedestrian identification model;

and adjusting parameters of the original pedestrian recognition model according to the first ID, the second ID, the first posture identification and the second posture identification of each sample image in the sample set, and the second identification information and the third identification information of any two sample images.

In a possible implementation manner, the adjusting, according to the first ID, the second ID, the first posture identifier, the second posture identifier, and the second identification information and the third identification information of any two sample images in the sample set, the parameters of the original pedestrian recognition model includes:

determining a first loss value according to whether the first ID and the second ID of each sample image in the sample set are consistent;

determining a second loss value according to whether the first posture identification and the second posture identification of each sample image in the sample set are consistent;

determining a third loss value according to whether the second identification information and the third identification information of any two sample images in the sample set are consistent;

and determining a total loss value according to the first loss value, the second loss value and the third loss value, and adjusting parameters of the original pedestrian recognition model according to the total loss value.

In a possible implementation, the determining a first loss value according to whether the first ID and the second ID of each sample image in the sample set are consistent comprises:

for each sample image in the sample set, counting the number of times that, when the sample image is input, the second ID of the sample image output by the original pedestrian recognition model is consistent with the first ID of the sample image, and determining the sub-loss value corresponding to the sample image according to that number of times and the total number of sample images contained in the sample set;

and determining the first loss value according to the sub-loss value corresponding to each sample image in the sample set.

In a possible implementation, the determining a second loss value according to whether the first pose identification and the second pose identification of each sample image in the sample set are consistent includes:

for each sample image in the sample set, counting the number of times that, when the sample image is input, the second ID of the sample image output by the original pedestrian recognition model is consistent with the first ID of the sample image and the second posture identification of the sample image output by the original pedestrian recognition model is consistent with the first posture identification of the sample image, and determining the sub-loss value corresponding to the sample image according to that number of times and the total number of sample images contained in the sample set;

and determining the second loss value according to the sub-loss value corresponding to each sample image in the sample set.

In a possible implementation manner, the determining a third loss value according to whether the second identification information and the third identification information of any two sample images in the sample set are consistent includes:

for any two sample images in the sample set, counting the number of times that, when the two sample images are input, the third identification information of the two sample images output by the original pedestrian recognition model is consistent with the second identification information of the two sample images, and determining the sub-loss value corresponding to the two sample images according to that number of times and the total number of sample images contained in the sample set;

and determining the third loss value according to the sub-loss values corresponding to any two sample images in the sample set.
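The three count-based loss values described above can be illustrated with a minimal Python sketch. The text does not give the formula that turns a count and the total number of sample images into a sub-loss value, so the sketch assumes one possible reading: each inconsistent item contributes 1 / total, and each loss value is the sum of its sub-loss values. The function name `mismatch_loss`, the unweighted total and all data values are assumptions made for illustration.

```python
from typing import Sequence


def mismatch_loss(matches: Sequence[bool], total: int) -> float:
    """Sum of sub-loss values; an inconsistent item contributes 1 / total (assumed formula)."""
    return sum((0.0 if match else 1.0) / total for match in matches)


# Hypothetical labels (first ...) and model outputs (second ...) for three sample images.
first_ids = ["ID1", "ID2", "ID1"]
second_ids = ["ID1", "ID2", "ID3"]
first_postures = [1, 1, 2]
second_postures = [1, 2, 2]
total = len(first_ids)

# First loss value: is the output ID consistent with the labelled ID?
id_matches = [a == b for a, b in zip(first_ids, second_ids)]

# Second loss value: as worded above, both the ID and the posture must be consistent.
posture_matches = [
    id_ok and p == q
    for id_ok, p, q in zip(id_matches, first_postures, second_postures)
]

# Third loss value: pairwise same-pedestrian information (second vs. third).
second_info = {(0, 1): 0, (0, 2): 1, (1, 2): 0}
third_info = {(0, 1): 0, (0, 2): 0, (1, 2): 0}
pair_matches = [second_info[pair] == third_info[pair] for pair in second_info]

first_loss = mismatch_loss(id_matches, total)
second_loss = mismatch_loss(posture_matches, total)
third_loss = mismatch_loss(pair_matches, total)

# Assumption: the total loss value is the plain sum of the three loss values.
total_loss = first_loss + second_loss + third_loss
```

A count-based loss of this kind is not differentiable, so a practical training setup would likely replace it with a smooth surrogate; the sketch only mirrors the wording of the steps.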

An embodiment of the present application further provides a pedestrian recognition apparatus, the apparatus includes:

an acquisition module, configured to acquire a first image to be detected and a second image of a target person;

a processing module, configured to determine, through a pre-trained pedestrian recognition model and according to the first image and the second image, first identification information indicating whether the first image and the second image are images of the same pedestrian;

and an identification module, configured to determine that the image of the target person is detected if it is determined according to the first identification information that the first image and the second image belong to the same pedestrian.

In a possible implementation manner, the processing module is specifically configured to determine, through a pre-trained pedestrian recognition model, first identification information indicating whether the first image and the second image are images of the same pedestrian, and to determine at least one of a posture identification of the pedestrian contained in each of the first image and the second image and an ID of the pedestrian contained in each of the first image and the second image.

In one possible embodiment, the apparatus comprises:

the training module is used for acquiring at least two sample images in a sample set, wherein each sample image corresponds to a first ID of a pedestrian contained in the sample image, a first posture identification of the pedestrian contained in each sample image, and second identification information of whether each two sample images belong to the image of the same pedestrian; acquiring a second ID of the pedestrian contained in each sample image, a second posture identification of the pedestrian contained in each sample image and third identification information of whether each two sample images belong to the image of the same pedestrian or not according to the at least two sample images through an original pedestrian identification model; and adjusting parameters of the original pedestrian recognition model according to the first ID, the second ID, the first posture identification and the second posture identification of each sample image in the sample set, and the second identification information and the third identification information of any two sample images.

In a possible implementation manner, the training module is specifically configured to determine a first loss value according to whether the first ID and the second ID of each sample image in the sample set are consistent; determining a second loss value according to whether the first posture identification and the second posture identification of each sample image in the sample set are consistent; determining a third loss value according to whether the second identification information and the third identification information of any two sample images in the sample set are consistent; and determining a total loss value according to the first loss value, the second loss value and the third loss value, and adjusting parameters of the original pedestrian recognition model according to the total loss value.

In a possible implementation manner, the training module is specifically configured to count, for each sample image in the sample set, a number of times that a second ID of the sample image output by the original pedestrian recognition model is consistent with a first ID of the sample image when the sample image is input, and determine a sub-loss value corresponding to the sample image according to the number of times and a total number of sample images included in the sample set; and determining the first loss value according to the sub-loss value corresponding to each sample image in the sample set.

In a possible embodiment, the training module is specifically configured to, for each sample image in the sample set, count a number of times that a second ID of the sample image output by the original pedestrian recognition model is consistent with a first ID of the sample image and a second posture identifier of the sample image output by the original pedestrian recognition model is consistent with the first posture identifier of the sample image when the sample image is input, and determine a sub-loss value corresponding to the sample image according to the number of times and a total number of sample images included in the sample set; and determining the second loss value according to the sub-loss value corresponding to each sample image in the sample set.

In a possible implementation manner, the training module is specifically configured to count, for any two sample images in the sample set, times that second identification information and third identification information of the two sample images output by the original pedestrian recognition model are consistent when the two sample images are input, and determine sub-loss values corresponding to the two sample images according to the times and the total number of the sample images included in the sample set; and determining the third loss value according to the sub-loss values corresponding to any two sample images in the sample set.

An embodiment of the present application further provides an electronic device, where the electronic device at least includes a processor and a memory, and the processor is configured to implement the steps of the pedestrian identification method according to any one of the above descriptions when executing a computer program stored in the memory.

The embodiment of the present application further provides a computer-readable storage medium, which stores a computer program, and the computer program, when executed by a processor, implements the steps of any of the above pedestrian identification methods.

During pedestrian recognition, a first image to be detected and a second image of a target person are input into a pre-trained pedestrian recognition model; based on the pedestrian recognition model, first identification information indicating whether the first image and the second image are images of the same pedestrian is obtained; and when it is determined according to the first identification information that the first image and the second image are images of the same pedestrian, it is determined that an image of the target person has been detected. Because the first identification information is obtained from the pedestrian recognition model, whether the input images belong to the same pedestrian can be identified accurately without user participation, which improves the efficiency and accuracy of pedestrian recognition.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a pedestrian identification detection process provided in an example of the present application;

Fig. 2 is a schematic diagram of an identification process of a neural network system;

Fig. 3 is a schematic structural diagram of a pedestrian recognition device according to an embodiment of the present application;

Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to avoid manual setting of a threshold value by an operator and improve efficiency and accuracy of pedestrian recognition, the application provides a pedestrian recognition method, a pedestrian recognition device, pedestrian recognition equipment and a pedestrian recognition medium.

In order to make the purpose, technical solutions and advantages of the present application clearer, the present application will be described in further detail with reference to the accompanying drawings, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Example 1:

Fig. 1 is a schematic diagram of a pedestrian identification process provided in the present application, where the process includes the following steps:

S101: Acquire a first image to be detected and a second image of the target person.

The pedestrian identification method is applied to an electronic device, which may be an image acquisition device, a PC, a server or another intelligent device.

Pedestrian recognition is generally driven by a target person, that is, by a search requirement. To make recognition possible, a second image of the target person must be acquired. The second image contains image information of the target person, for example a front view of the target person. The target person shown in the second image is then searched for among the images stored in the image database, which are collected by the image acquisition devices used for monitoring.

Any image stored in the image database is taken as a first image to be detected, and pedestrian recognition is performed based on the first image and the second image.

S102: Determine, based on a pre-trained pedestrian recognition model and according to the first image and the second image, first identification information indicating whether the first image and the second image are images of the same pedestrian.

In order to improve the accuracy of pedestrian recognition, a pedestrian recognition model is trained in advance in the application, and the pedestrian recognition model can be a CNN network model or other network models.

Once the pedestrian recognition model has been trained, the first image and the second image are input into it during pedestrian recognition. The pedestrian recognition model processes the first image and the second image and outputs first identification information indicating whether they are images of the same pedestrian.

S103: If it is determined according to the first identification information that the first image and the second image belong to the same pedestrian, determine that the image of the target person is detected.

Specifically, the meaning of each value of the identification information is set in advance when the pedestrian recognition model is trained, so that after the electronic device acquires the first identification information output by the pedestrian recognition model, it can determine from that information whether the first image and the second image are images of the same pedestrian.

For example, if the first identification information is 1, it indicates that the first image and the second image belong to the same pedestrian, and if the first identification information is 0, it indicates that the first image and the second image belong to different pedestrians.

Therefore, the electronic device may determine whether the first image and the second image are images belonging to the same pedestrian based on the first identification information output by the pedestrian recognition model, and determine that the image of the target person is recognized if the first image and the second image are determined to be images belonging to the same pedestrian based on the first identification information. And then, corresponding processing is carried out based on the first image, so that the track information of the target person can be conveniently acquired. The subsequent processing after the target person is identified is not described in detail in the embodiment of the present application.

Because, during pedestrian recognition, the image to be detected and the image of the target person are input into the pre-trained pedestrian recognition model, the model itself judges whether the image to be detected shows the target person: it outputs "yes" when it does and "no" when it does not. This avoids manual setting of a threshold by an operator and improves the efficiency and accuracy of pedestrian recognition.
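Steps S101 to S103 can be illustrated with a minimal Python sketch. It assumes the trained model is available as a callable that takes the first image and the second image and returns 1 for the same pedestrian and 0 otherwise, matching the example encoding above. The names `find_target` and `toy_model`, the database keys and the string stand-ins for images are hypothetical and do not come from the application.

```python
from typing import Any, Callable, Iterable, List, Tuple

# Assumed interface: model(first_image, second_image) -> 1 (same pedestrian) or 0.
PairModel = Callable[[Any, Any], int]


def find_target(database: Iterable[Tuple[str, Any]],
                second_image: Any,
                model: PairModel) -> List[str]:
    """Return the keys of the database images judged to show the target person."""
    detected = []
    for key, first_image in database:
        # S101: every stored image is treated as a first image to be detected.
        first_identification = model(first_image, second_image)  # S102
        if first_identification == 1:                            # S103
            detected.append(key)
    return detected


def toy_model(first_image: str, second_image: str) -> int:
    """Stand-in for a trained model: images are strings like 'name_view'."""
    return int(first_image.split("_")[0] == second_image.split("_")[0])


database = [
    ("cam1/001", "alice_back"),
    ("cam2/017", "bob_front"),
    ("cam3/042", "alice_side"),
]
matches = find_target(database, "alice_front", toy_model)
```

No similarity threshold appears anywhere in the loop: the decision is taken from the model output alone, which is the point made in the paragraph above.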

Example 2:

In order to obtain detailed information about the pedestrians to be identified, on the basis of the above embodiment, the determining, based on a pre-trained pedestrian recognition model and according to the first image and the second image, of the first identification information indicating whether the first image and the second image are images of the same pedestrian includes:

After the first image and the second image are input into the pedestrian recognition model, the model processes them and can output not only the first identification information indicating whether the first image and the second image belong to the same pedestrian, but also at least one of a posture identification of the pedestrian contained in each of the first image and the second image and an ID of the pedestrian contained in each of the first image and the second image.

That is, the pedestrian recognition model may output the first identification information indicating whether the first image and the second image belong to the same pedestrian, together with the posture identification of the pedestrian contained in each image. For example, the model outputs first identification information indicating that the first image and the second image belong to the same pedestrian, a posture identification indicating that the pedestrian in the first image is seen from the front, and a posture identification indicating that the pedestrian in the second image is seen from the back.

The pedestrian recognition model may instead output the first identification information together with the ID of the pedestrian contained in each image. For example, the model outputs first identification information indicating that the first image and the second image belong to the same pedestrian, the ID1 of the pedestrian contained in the first image and the ID2 of the pedestrian contained in the second image.

Of course, the pedestrian recognition model may also output all three: the first identification information, the posture identification of the pedestrian contained in each image, and the ID of the pedestrian contained in each image. For example, the model outputs first identification information indicating that the first image and the second image belong to the same pedestrian, a posture identification indicating that the pedestrian in the first image is seen from the front, a posture identification indicating that the pedestrian in the second image is seen from the back, the ID1 of the pedestrian contained in the first image, and the ID2 of the pedestrian contained in the second image.

Because the pedestrian recognition model in the application can not only output the first identification information of whether the first image and the second image belong to the same pedestrian, but also output the posture identification of the pedestrian contained in the first image and the second image and the ID of the pedestrian contained in the first image and the second image, more detailed pedestrian information can be acquired when the pedestrian is recognized, and the accuracy of the pedestrian recognition is ensured.
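One way to represent the model output described in this embodiment is a small record in which the first identification information is always present and the posture identifications and IDs are optional. The following Python sketch is an illustration only; the class name `PairResult`, its field names and the example values are assumptions and are not defined in the application.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PairResult:
    """Hypothetical container for the output of the pedestrian recognition model."""
    # First identification information: 1 = same pedestrian, 0 = different pedestrians.
    same_pedestrian: int
    # Posture identification of the pedestrian in (first image, second image), if output.
    postures: Optional[Tuple[str, str]] = None
    # ID of the pedestrian in (first image, second image), if output.
    pedestrian_ids: Optional[Tuple[str, str]] = None

    def target_detected(self) -> bool:
        """The target person is detected when both images show the same pedestrian."""
        return self.same_pedestrian == 1


# Output with all optional details present (values are made up for illustration).
detailed = PairResult(
    same_pedestrian=1,
    postures=("front", "back"),
    pedestrian_ids=("P7", "P7"),
)

# Output that carries only the first identification information.
minimal = PairResult(same_pedestrian=0)
```

Keeping the optional fields separate from the decision field reflects the "at least one of" wording: a caller can rely on `same_pedestrian` and use the other fields only when they are present.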

Example 3:

In order to enable pedestrian recognition, in the present application, on the basis of the above embodiments, the pedestrian recognition model is trained as follows:

To train the pedestrian recognition model, a sample set for training is stored in the embodiment of the present application. Each sample image in the sample set corresponds to a posture identification, an ID, second identification information indicating whether the sample image and any other sample image are images of the same pedestrian, and the like. The posture identification identifies the posture of the pedestrian contained in the sample image, and the ID identifies the pedestrian contained in the sample image.

For example, if sample image M in the sample set contains a front view of pedestrian A, the posture identification corresponding to sample image M indicates a front view, and the corresponding ID is the ID of pedestrian A. If sample image M and another sample image N both contain pedestrian A, sample image M corresponds to identification information indicating that it contains the same pedestrian as sample image N.

The pedestrian recognition model includes a plurality of convolution layers and a pooling layer, and the structure of the pedestrian recognition model is not particularly limited in the embodiment of the present application.

In order to enable the pedestrian recognition model in the present application to recognize the input image of the target pedestrian, at the time of training the original pedestrian recognition model, at least two sample images may be used as the input of the pedestrian recognition model, and the number of the sample images input into the original pedestrian recognition model each time may be the same or different, for example, the sample images in the sample set are divided into a plurality of groups, the number of the sample images included in each group may be 2, 3, 4, and the like, and the same sample image may be located in different groups.
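The grouping of sample images described above can be sketched in a few lines of Python. The sketch assumes groups are drawn at random, that a sample image appears at most once within a group, and that the same sample image may appear in several groups; the function name `make_groups`, the fixed seed and the file-like identifiers are illustrative choices, not details from the application.

```python
import random
from typing import List, Sequence


def make_groups(sample_ids: Sequence[str],
                group_sizes: Sequence[int],
                num_groups: int,
                seed: int = 0) -> List[List[str]]:
    """Draw num_groups groups; each group has one of the allowed sizes."""
    rng = random.Random(seed)
    groups = []
    for _ in range(num_groups):
        size = rng.choice(list(group_sizes))
        # sample() picks distinct images within a group; across groups an
        # image can be reused, as the text allows.
        groups.append(rng.sample(list(sample_ids), size))
    return groups


sample_ids = ["img0", "img1", "img2", "img3", "img4"]
groups = make_groups(sample_ids, group_sizes=(2, 3, 4), num_groups=6)
```

Each group would then form one input to the original pedestrian recognition model during training.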

After at least two sample images in the sample set are acquired, together with the first posture identification and the first ID of each sample image and the second identification information of each sample image relative to the other acquired sample images, each sample image and its first posture identification, first ID and second identification information are input into the original pedestrian recognition model. The original pedestrian recognition model outputs the second posture identification, the second ID and the third identification information of the pedestrian contained in each sample image.

For example, three sample images are acquired. For sample image 1, the first posture identification is 1 and the first ID is ID1; its second identification information relative to sample image 2 is 0 and relative to sample image 3 is 1. For sample image 2, the first posture identification is 1 and the first ID is ID2; its second identification information relative to sample image 1 is 0 and relative to sample image 3 is 0. For sample image 3, the first posture identification is 2 and the first ID is ID1; its second identification information relative to sample image 1 is 1 and relative to sample image 2 is 0.
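The labels in this example can be written down as a small data structure and checked for consistency. The Python sketch below assumes that the first ID of sample image 1 is ID1 (which agrees with its label of 1 relative to sample image 3) and that the second identification information of a pair is 1 exactly when the two first IDs are equal. The dictionary layout and the function name `second_identification` are illustrative.

```python
from itertools import permutations

# Labels of the three sample images from the example.
samples = {
    1: {"posture": 1, "pedestrian_id": "ID1"},
    2: {"posture": 1, "pedestrian_id": "ID2"},
    3: {"posture": 2, "pedestrian_id": "ID1"},
}


def second_identification(a: int, b: int) -> int:
    """Return 1 if sample images a and b show the same pedestrian, else 0."""
    return int(samples[a]["pedestrian_id"] == samples[b]["pedestrian_id"])


# Second identification information for every ordered pair of distinct images.
pair_labels = {
    (a, b): second_identification(a, b)
    for a, b in permutations(samples, 2)
}
```

Under these assumptions the derived values agree with the six pairwise labels listed in the example, and the labels are symmetric: the value for (1, 3) equals the value for (3, 1).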

After the original pedestrian recognition model outputs the second posture identification, the second ID and the third identification information corresponding to the sample images, the original pedestrian recognition model is trained according to the first posture identification, the first ID and the second identification information corresponding to each sample image and the second posture identification, the second ID and the third identification information output by the original pedestrian recognition model.

The pedestrian recognition model is trained in the above manner, and the trained pedestrian recognition model is obtained when a preset condition is met. The preset condition may be that, among the sample images in the sample set, the number of sample images for which the second posture identification, the second ID and the third identification information output by the model are consistent with the first posture identification, the first ID and the second identification information is greater than a set number; or that the number of training iterations of the original pedestrian recognition model reaches a set maximum number of iterations; and so on. The embodiments of the present application do not specifically limit this.
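The two preset conditions can be expressed as a simple check, sketched below in Python. The sketch assumes each label and each model output is stored as a tuple of posture identification, ID and pairwise identification information; the names `count_consistent` and `should_stop` and the numeric thresholds are hypothetical.

```python
from typing import Sequence, Tuple

# (posture identification, ID, identification information relative to the other images)
Label = Tuple[int, str, Tuple[int, ...]]


def count_consistent(labels: Sequence[Label], outputs: Sequence[Label]) -> int:
    """Number of sample images whose three outputs all agree with their labels."""
    return sum(1 for label, output in zip(labels, outputs) if label == output)


def should_stop(num_consistent: int, iteration: int,
                set_number: int, max_iterations: int) -> bool:
    """Stop when enough images are consistent or the iteration limit is reached."""
    return num_consistent > set_number or iteration >= max_iterations


labels = [(1, "ID1", (0, 1)), (1, "ID2", (0, 0)), (2, "ID1", (1, 0))]
outputs = [(1, "ID1", (0, 1)), (2, "ID2", (0, 0)), (2, "ID1", (1, 0))]

consistent = count_consistent(labels, outputs)
stop_now = should_stop(consistent, iteration=10, set_number=2, max_iterations=100)
```

Here the second sample image has a wrong posture output, so only two images are fully consistent and training would continue.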

A specific embodiment of the training process of the pedestrian recognition model in the present application is described below.

Fig. 2 is a schematic diagram of a training process of a pedestrian recognition model provided in an embodiment of the present application. Three sample images are obtained and input into an original pedestrian recognition model. Each of the three input sample images corresponds to a first ID of the pedestrian it contains and a first posture identification of that pedestrian's posture, and every two sample images also correspond to second identification information indicating whether the two sample images show the same pedestrian.

The original pedestrian recognition model is a CNN-based model. A feature extraction layer in the original pedestrian recognition model extracts the features f₁, f₂, f₃ of the three sample images. The features f₁, f₂, f₃ obtained from the feature extraction layer are connected pairwise and passed to a fully connected (fc) layer. The fully connected layer finally outputs the second ID of the pedestrian contained in each sample image, the second posture identification corresponding to each sample image, and the third identification information of whether any two sample images are images of the same pedestrian.
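The pairwise connection of features can be sketched as follows. This assumes that "connecting" means concatenating the two feature vectors end to end, which the description does not state explicitly; the feature values and the name `connect_pairs` are illustrative only.

```python
from itertools import combinations

def connect_pairs(features):
    """Concatenate every unordered pair of feature vectors.

    features: list of equal-length feature vectors (lists of floats).
    Returns {(i, j): features[i] + features[j]} with i < j, i.e. the
    inputs that would be passed on to the fully connected layer.
    """
    pairs = {}
    for i, j in combinations(range(len(features)), 2):
        pairs[(i, j)] = list(features[i]) + list(features[j])
    return pairs

# Toy stand-ins for the features f1, f2, f3 of the three sample images.
f1, f2, f3 = [0.1, 0.2], [0.3, 0.4], [0.5, 0.6]
pair_inputs = connect_pairs([f1, f2, f3])
```

Three sample images yield three pairs, so the fully connected layer receives three pairwise inputs for the same-pedestrian decision.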

An ID loss is determined according to whether the first ID is consistent with the second ID, a posture loss is determined according to whether the first posture identification is consistent with the second posture identification, and a discrimination loss is determined according to whether the second identification information is consistent with the third identification information. A total loss value is determined from the ID loss, the posture loss and the discrimination loss, and the total loss value is propagated back through a back propagation algorithm to adjust the parameters of the original pedestrian recognition model.

Example 4:

in order to further improve the accuracy of pedestrian recognition, in the present application, on the basis of the foregoing embodiments, the adjusting, according to the first ID, the second ID, the first posture identifier, the second posture identifier of each sample image in the sample set, and the second identification information and the third identification information of any two sample images, the parameter of the original pedestrian recognition model includes:

The original pedestrian recognition model determines the second posture identification, the second ID and the third identification information of the pedestrian contained in each sample image. Because each sample image is labeled in advance, it also has a corresponding first posture identification, first ID and second identification information. The parameters of the original pedestrian recognition model are therefore adjusted according to the errors between the second posture identification, the second ID and the third identification information and the corresponding first posture identification, first ID and second identification information.

Specifically, the first loss value may be determined according to whether the first ID and the second ID of each sample image are consistent; determining a second loss value according to whether the first posture identification and the second posture identification of each sample image are consistent; and determining a third loss value according to whether the second identification information and the third identification information of any two sample images in the sample set are consistent.
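As a rough illustration of a loss driven by consistency, the sketch below uses a plain mismatch rate as a stand-in for each of the three loss values. This is an assumption made for illustration and is not the formula of the present application; the name `mismatch_rate` and the sample values are hypothetical.

```python
def mismatch_rate(labels, predictions):
    """Fraction of positions where the prediction differs from the label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        return 0.0
    mismatches = sum(1 for y, p in zip(labels, predictions) if y != p)
    return mismatches / len(labels)

# First loss: first ID versus second ID for each sample image.
first_loss = mismatch_rate(["ID1", "ID2", "ID1"], ["ID1", "ID2", "ID2"])

# Second loss: first versus second posture identification for each sample image.
second_loss = mismatch_rate([1, 1, 2], [1, 1, 2])

# Third loss: second versus third identification information for each pair.
third_loss = mismatch_rate([0, 1, 0], [0, 0, 0])
```

A mismatch rate is not differentiable, so a practical implementation would likely substitute a differentiable loss for back propagation; the sketch only shows which quantities are compared.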

A total loss value is determined from a sum of the first loss value, the second loss value, and the third loss value. When determining the total loss value, the total loss value may also be determined according to a weighted sum of the first loss value, the second loss value, the third loss value, and the corresponding weight values. The weight value corresponding to each loss value can be the same or different, and can be flexibly set according to the requirement during specific use.

Example 5:

in order to ensure the accuracy of the identification of the pedestrian identification model, on the basis of the above embodiments, in the present application, the determining a first loss value according to whether the first ID and the second ID of each sample image in the sample set are consistent includes:

When the pedestrian recognition model is trained, each sample image in the sample set may be recognized multiple times. Each recognition of a sample image generates a second ID, and a loss arises between the second ID and the first ID. The number of times that the second ID of the sample image is consistent with the first ID of the sample image is counted, the first sub-loss value corresponding to the sample image is determined according to that number and the total number of sample images in the sample set, and the first loss value is determined according to the first sub-loss values corresponding to all the sample images in the sample set.

In the present application, the first loss value may be determined by the following formula:

where L_id is the first loss value, q(i) is the normalization function corresponding to the first loss value, p_j is 1 when the second ID of sample image j is consistent with the first ID of sample image j and 0 otherwise, and K is the number of sample images contained in the sample set.

Example 6:

in order to ensure the accuracy of the identification of the pedestrian identification model, on the basis of the foregoing embodiments, in this application, the determining a second loss value according to whether the first pose identification and the second pose identification of each sample image in the sample set are consistent includes:

When the original pedestrian recognition model is trained, each sample image in the sample set may be recognized multiple times. Each recognition of a sample image generates a second posture identification, and a loss arises between the second posture identification and the first posture identification. The number of times that the second posture identification of the sample image is consistent with the first posture identification of the sample image is counted, the second sub-loss value corresponding to the sample image is determined according to that number and the total number of sample images in the sample set, and the second loss value is determined according to the second sub-loss values corresponding to all the sample images in the sample set.

In the present application, the calculation of the second loss value may be determined by the following formula:

where L_pos is the second loss value, r(j, k) is the normalization function corresponding to the second loss value, t_jk is 1 when the second ID of sample image j is consistent with the first ID of sample image j and the second posture identification is consistent with the first posture identification, and 0 otherwise, and K is the number of sample images contained in the sample set.

Example 7:

in order to ensure the accuracy of the identification of the pedestrian identification model, on the basis of the foregoing embodiments, in this application, the determining a third loss value according to whether the second identification information and the third identification information of any two sample images in the sample set are consistent includes:

for any two sample images in the sample set, counting the number of times that the third identification information of the two sample images, output by the original pedestrian recognition model when the two sample images are input, is consistent with the second identification information of the two sample images, and determining the sub-loss value corresponding to the two sample images according to the number of times and the total number of sample images contained in the sample set;

When the pedestrian recognition model is trained, every two sample images in the sample set may be recognized multiple times. Each recognition of two sample images generates third identification information, and a loss arises between the third identification information and the second identification information. The number of times that the third identification information of the two sample images is consistent with the second identification information of the two sample images is counted, the third sub-loss value corresponding to the two sample images is determined according to that number and the total number of sample images contained in the sample set, and the third loss value is determined according to the third sub-loss values corresponding to any two sample images in the sample set.

In the present application, the calculation of the third loss value may be determined by the following formula:

where L_ve is the third loss value, b(i) is the normalization function corresponding to the third loss value, f_j is 1 when the third identification information corresponding to sample image v and sample image e is consistent with the second identification information corresponding to sample image v and sample image e, and 0 otherwise, and K is the number of sample images contained in the sample set.

Determining a total loss value according to the first loss value, the second loss value and the third loss value, wherein the total loss value can be calculated by the following formula:

L = L_id + λ₁L_pos + λ₂L_ve

In the formula, λ₁ and λ₂ are the weight values corresponding to the second loss value and the third loss value, respectively.
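The total loss formula translates directly into code. In the sketch below, the function name `total_loss` and the default weights of 1.0 are assumptions; the application leaves the weight values to be set as required.

```python
def total_loss(l_id, l_pos, l_ve, lambda_1=1.0, lambda_2=1.0):
    """Compute L = L_id + lambda_1 * L_pos + lambda_2 * L_ve.

    l_id:     first loss value (ID loss).
    l_pos:    second loss value (posture loss).
    l_ve:     third loss value (same-pedestrian discrimination loss).
    lambda_1: weight of the second loss value (default is an assumption).
    lambda_2: weight of the third loss value (default is an assumption).
    """
    return l_id + lambda_1 * l_pos + lambda_2 * l_ve
```

Setting both weights to 1 reduces the expression to the plain sum of the three loss values; larger values of λ₁ or λ₂ place more emphasis on the posture or discrimination term.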

Example 8:

fig. 3 is a schematic structural diagram of a pedestrian recognition device according to some embodiments of the present application, where the device includes:

the acquiring module 31 is configured to acquire a first image to be detected and a second image of a target person;

the processing module 32 is configured to determine, according to the first image and the second image and through a pre-trained pedestrian recognition model, first identification information indicating whether the first image and the second image are images of the same pedestrian;

and the identifying module 33 is configured to determine that the image of the target person is detected when it is determined, according to the first identification information, that the first image and the second image are images of the same pedestrian.
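The cooperation of the three modules can be sketched as a single function. The sketch assumes that the model returns a score between 0 and 1 for "same pedestrian" and that a threshold of 0.5 converts it into the first identification information; the score, the threshold, `identify_target` and `stub_model` are all hypothetical.

```python
def identify_target(first_image, second_image, model, threshold=0.5):
    """Return True when the image of the target person is deemed detected.

    first_image:  the image to be detected (acquiring module).
    second_image: the image of the target person (acquiring module).
    model:        callable returning a same-pedestrian score
                  (stand-in for the processing module).
    threshold:    assumed cut-off for deciding "same pedestrian"
                  (stand-in for the identifying module).
    """
    same_score = model(first_image, second_image)
    return same_score >= threshold

def stub_model(image_a, image_b):
    """Toy stand-in for a trained model: high score only for equal inputs."""
    return 0.9 if image_a == image_b else 0.1

detected = identify_target("img_a", "img_a", stub_model)
```

In a real deployment the stub would be replaced by the trained pedestrian recognition model, and the inputs would be image data rather than strings.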

In a possible implementation, the processing module 32 is specifically configured to determine, through a pre-trained pedestrian recognition model, first identification information of whether the first image and the second image belong to the same pedestrian, and to determine at least one of a posture identification of the pedestrian included in the first image and the second image and an ID of the pedestrian included in the first image and the second image.

In one possible embodiment, the apparatus comprises:

the training module 34 is used for acquiring at least two sample images in the sample set, wherein each sample image corresponds to a first ID of a pedestrian contained in the sample image, a first posture identification of the pedestrian contained in each sample image, and second identification information of whether each two sample images belong to the image of the same pedestrian; acquiring a second ID of the pedestrian contained in each sample image, a second posture identification of the pedestrian contained in each sample image and third identification information of whether each two sample images belong to the image of the same pedestrian or not according to the at least two sample images through an original pedestrian identification model; and adjusting parameters of the original pedestrian recognition model according to the first ID, the second ID, the first posture identification and the second posture identification of each sample image in the sample set, and the second identification information and the third identification information of any two sample images.

In a possible implementation, the training module 34 is specifically configured to determine a first loss value according to whether the first ID and the second ID of each sample image in the sample set are consistent; determining a second loss value according to whether the first posture identification and the second posture identification of each sample image in the sample set are consistent; determining a third loss value according to whether the second identification information and the third identification information of any two sample images in the sample set are consistent; and determining a total loss value according to the first loss value, the second loss value and the third loss value, and adjusting parameters of the original pedestrian recognition model according to the total loss value.

In a possible embodiment, the training module 34 is specifically configured to count, for each sample image in the sample set, a number of times that a second ID of the sample image output by the original pedestrian recognition model is consistent with a first ID of the sample image when the sample image is input, and determine a sub-loss value corresponding to the sample image according to the number of times and a total number of sample images included in the sample set; and determining the first loss value according to the sub-loss value corresponding to each sample image in the sample set.

In a possible embodiment, the training module 34 is specifically configured to, for each sample image in the sample set, count a number of times that the second ID of the sample image output by the original pedestrian recognition model is consistent with the first ID of the sample image and the second pose identification of the sample image output by the original pedestrian recognition model is consistent with the first pose identification of the sample image when the sample image is input, and determine a sub-loss value corresponding to the sample image according to the number of times and the total number of sample images included in the sample set; and determining the second loss value according to the sub-loss value corresponding to each sample image in the sample set.

In a possible implementation manner, the training module 34 is specifically configured to count, for any two sample images in the sample set, the number of times that second identification information and third identification information of the two sample images output by the original pedestrian recognition model are consistent when the two sample images are input, and determine a sub-loss value corresponding to the two sample images according to the number of times and the total number of sample images included in the sample set; and determining the third loss value according to the sub-loss values corresponding to any two sample images in the sample set.

Example 9:

fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application, and on the basis of the foregoing embodiments, an embodiment of the present application further provides an electronic device, as shown in fig. 4, including: the system comprises a processor 41, a communication interface 42, a memory 43 and a communication bus 44, wherein the processor 41, the communication interface 42 and the memory 43 complete mutual communication through the communication bus 44;

the memory 43 has stored therein a computer program which, when executed by the processor 41, causes the processor 41 to perform the steps of:

acquiring a first image to be detected and a second image of a target person;

In a possible implementation manner, the processor is further configured to determine, based on a pre-trained pedestrian recognition model, first identification information of whether the first image and the second image belong to the same pedestrian, and to determine at least one of a posture identification of the pedestrian included in the first image and the second image and an ID of the pedestrian included in the first image and the second image.

In one possible embodiment, the pedestrian recognition model is trained by:

Because the principle of the electronic device for solving the problems is similar to that of the pedestrian identification method, the implementation of the electronic device can refer to the implementation of the method, and repeated details are not repeated.

The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

The communication interface 42 is used for communication between the above-described electronic apparatus and other apparatuses.

The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.

The processor may be a general-purpose processor, including a central processing unit, a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.

Example 10:

on the basis of the foregoing embodiments, the present application further provides a computer-readable storage medium, in which a computer program executable by a processor is stored, and when the program runs on the processor, the processor is caused to execute the following steps:

acquiring a first image to be detected and a second image of a target person;

In one possible embodiment, the pedestrian recognition model is trained by:

Because the first identification information of whether the first image and the second image are images of the same pedestrian is determined based on the pedestrian recognition model in the present application, whether the input images are images of the same pedestrian can be recognized accurately without user participation in the process, which improves the efficiency and accuracy of pedestrian recognition.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims

1. A pedestrian identification method, characterized in that the method comprises:

acquiring a first image to be detected and a second image of a target person;

2. The method according to claim 1, wherein the determining, according to the first image and the second image and based on a pre-trained pedestrian recognition model, of first identification information of whether the first image and the second image are images of the same pedestrian comprises:

3. The method of claim 2, wherein the pedestrian recognition model is trained by:

4. The method of claim 3, wherein the adjusting the parameters of the original pedestrian recognition model according to the first ID, the second ID, the first posture identification, the second posture identification of each sample image in the sample set, and the second identification information and the third identification information of any two sample images comprises:

5. The method of claim 4, wherein determining a first loss value based on whether the first ID and the second ID are consistent for each sample image in the sample set comprises:

6. The method of claim 4, wherein determining a second loss value based on whether the first pose identification and the second pose identification are consistent for each sample image in the sample set comprises:

7. The method of claim 4, wherein determining a third loss value according to whether the second identification information and the third identification information of any two sample images in the sample set are consistent comprises:

8. A pedestrian recognition apparatus, characterized in that the apparatus comprises:

the acquisition module is used for acquiring a first image to be detected and a second image of a target person;

the processing module is used for determining, according to the first image and the second image and through a pre-trained pedestrian recognition model, first identification information of whether the first image and the second image are images of the same pedestrian;

9. The apparatus according to claim 8, wherein the processing module is specifically configured to determine, through a pre-trained pedestrian recognition model, first identification information of whether the first image and the second image belong to the same pedestrian, and to determine at least one of a posture identification of the pedestrian included in the first image and the second image and an ID of the pedestrian included in the first image and the second image.

10. The apparatus of claim 9, wherein the apparatus comprises:

11. The apparatus of claim 10, wherein the training module is specifically configured to determine a first loss value according to whether the first ID and the second ID of each sample image in the sample set are consistent; determining a second loss value according to whether the first posture identification and the second posture identification of each sample image in the sample set are consistent; determining a third loss value according to whether the second identification information and the third identification information of any two sample images in the sample set are consistent; and determining a total loss value according to the first loss value, the second loss value and the third loss value, and adjusting parameters of the original pedestrian recognition model according to the total loss value.

12. The apparatus according to claim 11, wherein the training module is specifically configured to count, for each sample image in the sample set, a number of times that a second ID of the sample image output by the original pedestrian recognition model is consistent with a first ID of the sample image when the sample image is input, and determine a sub-loss value corresponding to the sample image according to the number of times and a total number of sample images included in the sample set; and determining the first loss value according to the sub-loss value corresponding to each sample image in the sample set.

13. The apparatus according to claim 11, wherein the training module is specifically configured to count, for each sample image in the sample set, a number of times that the second ID of the sample image output by the original pedestrian recognition model is consistent with the first ID of the sample image and the second pose identification of the sample image output by the original pedestrian recognition model is consistent with the first pose identification of the sample image when the sample image is input, and determine the sub-loss value corresponding to the sample image according to the number of times and the total number of sample images included in the sample set; and determining the second loss value according to the sub-loss value corresponding to each sample image in the sample set.

14. The apparatus according to claim 11, wherein the training module is specifically configured to count, for any two sample images in the sample set, a number of times that second identification information and third identification information of the two sample images output by the original pedestrian recognition model are consistent when the two sample images are input, and determine a sub-loss value corresponding to the two sample images according to the number of times and a total number of sample images included in the sample set; and determining the third loss value according to the sub-loss values corresponding to any two sample images in the sample set.

15. An electronic device, characterized in that the electronic device comprises at least a processor and a memory, the processor being adapted to carry out the steps of the pedestrian identification method according to any one of claims 1 to 7 when executing a computer program stored in the memory.

16. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the pedestrian identification method according to any one of claims 1 to 7.