CN108108662B - Deep neural network recognition model and recognition method - Google Patents

Deep neural network recognition model and recognition method

Info

Publication number
CN108108662B
CN108108662B (Application CN201711209932.7A)
Authority
CN
China
Prior art keywords
training
matrix
layer
weight matrix
pedestrian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711209932.7A
Other languages
Chinese (zh)
Other versions
CN108108662A (en)
Inventor
张德雷
何其佳
曾儿孟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN HARZONE TECHNOLOGY CO LTD
Original Assignee
SHENZHEN HARZONE TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN HARZONE TECHNOLOGY CO LTD
Priority to CN201711209932.7A
Publication of CN108108662A
Application granted
Publication of CN108108662B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

Embodiments of the present application disclose a deep neural network recognition model and a recognition method. In the method, the fine-tuning training layer trains the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and obtains a target weight matrix when training is completed; the fully connected layer extracts pedestrian features from a target image according to the target weight matrix, acquires the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities, ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity, and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data. The embodiments of the present application can improve the recognition speed and accuracy of pedestrian re-identification.

Description

Deep neural network recognition model and recognition method
Technical Field
The application relates to the field of neural network algorithms, in particular to a deep neural network recognition model and a deep neural network recognition method.
Background
Pedestrian re-identification has great application potential in fields such as security, criminal investigation, and image retrieval; it mainly aims to determine whether the pedestrian in an image captured by a given camera is a target person. However, the accuracy of pedestrian re-identification methods in real monitoring environments is low. How to improve the accuracy of pedestrian re-identification is therefore a technical problem to be solved by those skilled in the art.
Disclosure of Invention
Embodiments of the present application provide a deep neural network recognition model and a recognition method, which can improve the recognition speed and accuracy of pedestrian re-identification.
In a first aspect, an embodiment of the present application provides a deep neural network recognition model, including a fine-tuning training layer and a fully connected layer connected to the fine-tuning training layer, where the fully connected layer is a trained neural network recognition model, and where:
the fine-tuning training layer is configured to train the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and to obtain a target weight matrix when training is completed;
the fully connected layer is configured to extract pedestrian features from a target image according to the target weight matrix; acquire the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; rank the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determine the feature data of the pedestrian in the target image according to that preset data.
With reference to the first aspect, in a first possible implementation of the first aspect, the fine-tuning training layer includes a decorrelation training layer, a tension training layer connected to the decorrelation training layer, and a relaxation training layer connected to the tension training layer, where:
the decorrelation training layer is configured to perform singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and to take the product of the first matrix and the second matrix as a reference weight matrix, where the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer is configured to fix the reference weight matrix and train the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer is configured to train the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
With reference to the first aspect or its first possible implementation, in a second possible implementation of the first aspect, the deep neural network recognition model further includes a preprocessing layer connected to the fine-tuning training layer and configured to matte a target test image to obtain a pedestrian image, and to resize the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
With reference to the first aspect or its first possible implementation, in a third possible implementation of the first aspect, the fully connected layer is specifically configured to obtain the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
With reference to the first aspect or its first possible implementation, in a fourth possible implementation of the first aspect, the training set includes training images corresponding to a plurality of angles, and each angle corresponds to at least one training image.
In a second aspect, an embodiment of the present application provides a recognition method based on the deep neural network recognition model of the first aspect, where:
the fine-tuning training layer trains the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and obtains a target weight matrix when training is completed; and
the fully connected layer extracts pedestrian features from the target image according to the target weight matrix; acquires the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data.
With reference to the second aspect, in a first possible implementation of the second aspect, the fine-tuning training layer training the fully connected layer according to the original weight matrix of the fully connected layer to obtain a target weight matrix includes:
the decorrelation training layer performing singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and taking the product of the first matrix and the second matrix as a reference weight matrix, where the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer fixing the reference weight matrix and training the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer training the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
With reference to the second aspect or its first possible implementation, in a second possible implementation of the second aspect, before the fully connected layer extracts the pedestrian features of the target image according to the target weight matrix, the method further includes:
the preprocessing layer matting the target test image to obtain a pedestrian image, and resizing the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
With reference to the second aspect or its first possible implementation, in a third possible implementation of the second aspect, the fully connected layer acquiring the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities includes:
the fully connected layer obtaining the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
With reference to the second aspect or its first possible implementation, in a fourth possible implementation of the second aspect, the training set includes training images corresponding to a plurality of angles, and each angle corresponds to at least one training image.
In a third aspect, the present embodiments provide a computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method of the second aspect.
With the deep neural network recognition model and recognition method described above, the fully connected layer is trained by the fine-tuning training layer according to the original weight matrix of the fully connected layer and the training set, and a target weight matrix is obtained when training is completed; the fully connected layer then extracts pedestrian features from the target image according to the target weight matrix, acquires the similarity between the pedestrian features and each item of preset data in the preset data set to obtain a plurality of similarities, ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity, and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data. That is, the weight matrix of the fully connected layer is optimized by the fine-tuning training layer, which improves the recognition efficiency and accuracy of the deep neural network recognition model, namely, the recognition speed and accuracy of pedestrian re-identification.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; other drawings can be obtained from them by those of ordinary skill in the art without creative effort.
Wherein:
FIG. 1 is a schematic structural diagram of a deep neural network recognition model provided in an embodiment of the present application;
Fig. 1A is a schematic structural diagram of a fine-tuning training layer provided in an embodiment of the present application;
Fig. 2 is a schematic flowchart of a recognition method of a deep neural network recognition model provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to a determination", or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Embodiments of the present application provide a deep neural network recognition model and a recognition method, where the deep neural network recognition model is built on a trained neural network recognition model. That is, the present application performs fine-tuning training on the neural network recognition model, which can improve the recognition speed and accuracy of pedestrian re-identification. The present application is described in further detail below with reference to specific embodiments and the accompanying drawings.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of a deep neural network recognition model provided in the present application. As shown in Fig. 1, the deep neural network recognition model 100 includes a fine-tuning training layer 102 and a fully connected layer 104, where the fine-tuning training layer 102 is the layer above the fully connected layer 104, i.e., the output of the fine-tuning training layer 102 is connected to the input of the fully connected layer 104, and the fully connected layer 104 is a trained neural network recognition model.
In the present application, the training set of the deep neural network recognition model 100 is the same as that of the underlying neural network recognition model: it includes training images corresponding to a plurality of angles, each angle corresponding to at least one training image. The training images may be images of the same person acquired by different cameras, or by the same camera at different angles. When the training set contains training images for a plurality of angles, training can target each angle separately, which improves the accuracy of pedestrian re-identification across angles and facilitates subsequent tracking and monitoring. The number of training images in the training set of the deep neural network recognition model 100 is not limited; the more images there are, the more training can be performed, and the higher the accuracy of the deep neural network recognition model 100.
Optionally, the fully connected layer 104 is configured to perform angle recognition on a training image to obtain the angle of the training image, and then perform feature recognition on the training image according to that angle to obtain the pedestrian features of the pedestrian object in the training image. That is, the angle of the training image is determined first, and feature recognition is then performed conditioned on that angle, which can improve the accuracy of attribute recognition.
For example, suppose the gender attribute of a pedestrian takes the values male and female, and the angle attribute is divided into 3 classes: front, side, and back. The network can first produce an output over the 6 combined classes (2 genders × 3 angles), and the final result for the 2 gender classes is then obtained from that output, improving the accuracy of pedestrian feature recognition; a sketch of this two-step readout follows.
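As a minimal illustrative sketch (not part of the patent text; the joint-class layout and the function name are assumptions), the combined 6-class output can be collapsed back to a gender decision by summing over the angle classes:

    import numpy as np

    # Hypothetical layout of the 6 joint classes: 2 genders x 3 angles.
    JOINT_CLASSES = [
        ("male", "front"), ("male", "side"), ("male", "back"),
        ("female", "front"), ("female", "side"), ("female", "back"),
    ]

    def gender_from_joint(probs):
        # Collapse a 6-way joint distribution over (gender, angle)
        # back to the 2 gender classes by marginalizing over angle.
        probs = np.asarray(probs, dtype=float)
        male = probs[:3].sum()     # sum over the 3 angle classes for "male"
        female = probs[3:].sum()   # sum over the 3 angle classes for "female"
        return "male" if male >= female else "female"

    # Example: an output most confident in ("female", "side").
    print(gender_from_joint([0.05, 0.10, 0.05, 0.15, 0.45, 0.20]))  # -> female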
In the present application, the convergence value of the training loss is set as a first threshold, and the limit on the number of training iterations is set as a second threshold.
It should be noted that the deep neural network recognition model 100 and the neural network recognition model involved in the present application are trained in the same way as other neural network models: one training period consists of a single forward operation and a single backward gradient propagation. That is, the forward operation is performed according to the connections in the deep neural network recognition model 100; when the loss between the output attribute produced by the deep neural network recognition model 100 and the expected output attribute is greater than the first threshold and the training count is less than or equal to the second threshold, backward gradient propagation is carried out according to the loss, i.e., the weights of each layer are corrected by loss-gradient descent. By repeating this forward propagation of information and backward propagation of the loss gradient, the deep neural network recognition model 100 is trained, the loss it outputs is reduced, and the recognition accuracy is improved.
Optionally, the fine-tuning training layer 102 is configured to calculate the loss as Loss = (y_p - y)^2, where y_p is the expected output attribute and y is the output attribute.
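As an illustrative sketch only (assuming PyTorch; the helper name and the averaging of the squared error over a batch are assumptions), the training schedule described above, i.e. a forward operation followed by backward gradient propagation while the loss exceeds the first threshold and the training count does not exceed the second threshold, can be written as:

    import torch

    def fine_tune(layer, optimizer, train_loader, loss_threshold, max_iters):
        # loss_threshold: the first threshold (convergence value of the loss)
        # max_iters: the second threshold (limit on the number of trainings)
        it = 0
        for x, y_expected in train_loader:
            y = layer(x)                           # forward operation
            loss = ((y_expected - y) ** 2).mean()  # Loss = (y_p - y)^2
            if loss.item() <= loss_threshold or it > max_iters:
                break                              # training is finished
            optimizer.zero_grad()
            loss.backward()                        # backward gradient propagation
            optimizer.step()                       # correct weights by loss-gradient descent
            it += 1
        return layer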
In the present application, the fine-tuning training layer 102 is configured to train the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and to obtain a target weight matrix when training is completed.
The weight matrix of the fully connected layer 104 determines the accuracy of the fully connected layer 104 and hence of the neural network recognition model; by training the fully connected layer 104, the accuracy of the deep neural network recognition model 100 can be improved over that of the underlying neural network recognition model.
Optionally, as shown in fig. 1A, the fine-tuning training layer 102 includes a decorrelation training layer 1021, a tension training layer 1022 connected to the decorrelation training layer 1021, and a relaxation training layer 1023 connected to the tension training layer 1022.
The decorrelation training layer 1021 is configured to perform Singular Value Decomposition (SVD) on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and use a product between the first matrix and the second matrix as a reference weight matrix.
Singular value decomposition is a data analysis method for finding the "patterns" implicit in a large amount of data; it can be used in pattern recognition, data compression, and the like to map a data set into a low-dimensional space. The computation is as follows: assuming the original weight matrix W is an n×m matrix, then
W = USV'
where U is an n×n orthonormal matrix, S is an n×m diagonal matrix, and V is an m×m orthonormal matrix; the reference weight matrix is then W' = US.
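A minimal sketch of this decorrelation step (assuming NumPy and the n×m shapes given above; the function name is an assumption):

    import numpy as np

    def decorrelate(W):
        # W: original n x m weight matrix of the fully connected layer.
        n, m = W.shape
        U, s, Vt = np.linalg.svd(W, full_matrices=True)  # W = U S V'
        S = np.zeros((n, m))
        k = min(n, m)
        S[:k, :k] = np.diag(s)  # embed the singular values in an n x m diagonal matrix
        return U @ S            # reference weight matrix W' = US

Note that since V is orthonormal, W' = US = WV, so the decorrelation amounts to rotating W by its right singular vectors.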
The tension training layer 1022 is configured to fix the reference weight matrix and train the fully connected layer 104 according to the training set to obtain a suboptimal weight matrix.
After the fine-tuning training layer 102 is added, the weights of the fully connected layer 104 change. Each time the tension training layer 1022 trains, a plurality of weight models is obtained; these weight models are ranked, and the optimal one is selected for the subsequent relaxation training.
The relaxation training layer 1023 is configured to train the fully connected layer 104 according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
The fixing of the suboptimal weight matrix is cancelled, and the fully connected layer 104 is trained according to the suboptimal weight matrix and the training set. Each training pass of the fully connected layer 104 again produces a plurality of weight models; these are ranked and the optimal one is selected for the next round of relaxation training. Training is completed when the fully connected layer 104 converges, yielding the target weight matrix.
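The tension and relaxation phases can be sketched as follows (assumptions: PyTorch; that "training with the reference weight matrix fixed" means updating the remaining trainable parameters while the layer's weight is frozen, in the spirit of the restraint-and-relaxation scheme of the cited SVDNet paper; and that train_to_convergence is a hypothetical helper running a training loop such as the one above):

    import torch

    def tension_then_relax(model, fc, W_ref, train_to_convergence):
        with torch.no_grad():
            fc.weight.copy_(torch.as_tensor(W_ref, dtype=fc.weight.dtype))
        fc.weight.requires_grad_(False)  # tension: the reference weight matrix is fixed
        train_to_convergence(model)      # yields the suboptimal weight matrix
        fc.weight.requires_grad_(True)   # relaxation: the fixing is cancelled
        train_to_convergence(model)      # yields the target weight matrix
        return model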
In the present application, the fully connected layer 104 is configured to extract pedestrian features from the target image according to the target weight matrix; acquire the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; rank the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determine the feature data of the pedestrian in the target image according to that preset data.
The preset threshold is the minimum similarity required to identify the pedestrian in the image; that is, when the maximum similarity exceeds the preset threshold, the corresponding preset data can be taken as the feature data of the pedestrian in the target image.
The way the similarity is calculated is not limited in the present application; it may be the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance.
It can be understood that the fully connected layer 104 performs both feature extraction and feature recognition in the deep neural network recognition model 100: it obtains the similarity between the pedestrian features in the target image and each item of preset data in the data set, and determines the feature data of the pedestrian in the target image according to whether the maximum similarity exceeds the preset threshold, thereby completing the pedestrian re-identification process.
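A sketch of this matching stage (assuming NumPy and cosine similarity; the Euclidean, Mahalanobis, or Hamming distance could be substituted, and all names below are assumptions):

    import numpy as np

    def re_identify(pedestrian_feature, preset_data_set, preset_threshold):
        # preset_data_set: list of (identity, feature_vector) pairs.
        q = pedestrian_feature / np.linalg.norm(pedestrian_feature)
        sims = [(pid, float(q @ (f / np.linalg.norm(f))))
                for pid, f in preset_data_set]
        sims.sort(key=lambda t: t[1], reverse=True)  # rank the similarities
        best_id, best_sim = sims[0]                  # maximum similarity
        # Only accept the match when it clears the preset threshold.
        return best_id if best_sim > preset_threshold else None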
Optionally, the deep neural network recognition model 100 further includes a preprocessing layer connected to the fine-tuning training layer 102, configured to matte a target test image to obtain a pedestrian image, and to resize the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
The target test image is any training image, or an image acquired by any camera. It can be understood that after the pedestrian image is extracted from the target test image, the parts irrelevant to the pedestrian are discarded, which improves the accuracy of pedestrian identification.
Since every neural network model has a required base input size, i.e., it can only process images of that size, the pedestrian image is resized so that the resulting target image matches the base input size of the deep neural network recognition model, which is the same as the base input size of the underlying neural network recognition model.
Optionally, when the image size of the pedestrian image is smaller than the base input size, the pedestrian image is padded up to the base input size using a preset image; when the image size of the pedestrian image is larger than the base input size of the deep neural network recognition model, the pedestrian image is scaled down to the base input size.
The preset image may be a solid-colour image, a lower-resolution image, or the like; details are not repeated here.
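A sketch of this size processing (assuming Pillow; the solid-colour fill stands in for the preset image, and the centring of the paste is an assumption, since the patent does not fix where the padding goes):

    from PIL import Image

    def to_input_size(pedestrian_img, base_size, fill=(0, 0, 0)):
        w, h = pedestrian_img.size
        bw, bh = base_size
        if w <= bw and h <= bh:
            # Smaller than the base input size: pad with the preset fill.
            canvas = Image.new("RGB", base_size, fill)
            canvas.paste(pedestrian_img, ((bw - w) // 2, (bh - h) // 2))
            return canvas
        # Larger than the base input size: scale down to it.
        return pedestrian_img.resize(base_size)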
In the deep neural network recognition model shown in Fig. 1, the fully connected layer is trained by the fine-tuning training layer 102 according to the original weight matrix of the fully connected layer and the training set, and a target weight matrix is obtained when training is completed; the fully connected layer 104 then extracts pedestrian features from the target image according to the target weight matrix, acquires the similarity between the pedestrian features and each item of preset data in the preset data set to obtain a plurality of similarities, ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity, and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data. That is, the weight matrix of the fully connected layer 104 is optimized by the fine-tuning training layer 102, which improves the recognition efficiency and accuracy of the deep neural network recognition model, namely, the recognition speed and accuracy of pedestrian re-identification.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a recognition method of a deep neural network recognition model provided in an embodiment of the present application. As shown in Fig. 2, the method is applied to the deep neural network recognition model shown in Fig. 1, where:
201: the fine-tuning training layer trains the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and obtains a target weight matrix when training is completed.
202: the fully connected layer extracts pedestrian features from the target image according to the target weight matrix; acquires the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data.
Optionally, the fine-tuning training layer training the fully connected layer according to the original weight matrix of the fully connected layer to obtain a target weight matrix includes:
the decorrelation training layer performing singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and taking the product of the first matrix and the second matrix as a reference weight matrix, where the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer fixing the reference weight matrix and training the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer training the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
Optionally, before the fully connected layer extracts the pedestrian features of the target image according to the target weight matrix, the method further includes:
the preprocessing layer matting the target test image to obtain a pedestrian image, and resizing the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
Optionally, the fully connected layer acquiring the similarity between the pedestrian features and each item of preset data in the preset data set to obtain a plurality of similarities includes:
the fully connected layer obtaining the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
Optionally, the training set includes training images corresponding to a plurality of angles, and each angle corresponds to at least one training image.
In the recognition method of the deep neural network recognition model shown in Fig. 2, the fully connected layer is trained by the fine-tuning training layer according to the original weight matrix of the fully connected layer and the training set, and a target weight matrix is obtained when training is completed; the fully connected layer then extracts pedestrian features from the target image according to the target weight matrix, acquires the similarity between the pedestrian features and each item of preset data in the preset data set to obtain a plurality of similarities, ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity, and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data. That is, the weight matrix of the fully connected layer is optimized by the fine-tuning training layer, which improves the recognition efficiency and accuracy of the deep neural network recognition model, namely, the recognition speed and accuracy of pedestrian re-identification.
In another embodiment of the present invention, a computer-readable storage medium is provided that stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the recognition method of the deep neural network recognition model described above.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both; the components and steps of the examples have been described above in general functional terms to illustrate clearly the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints of the implementation. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the terminal and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal and method can be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is only one type of division of logical functions, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, the technical solution of the present invention essentially or partially contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the above method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be noted that implementations not shown or described in the drawings or in the description are all forms known to a person of ordinary skill in the art and are not described in detail. Further, the above definitions of the various elements and methods are not limited to the specific structures, shapes, or arrangements mentioned in the embodiments, which may be easily modified or substituted by those of ordinary skill in the art.
The above-mentioned embodiments are further described in detail for the purpose of illustrating the invention, and it should be understood that the above-mentioned embodiments are only illustrative of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A deep neural network recognition model, characterized by comprising a fine-tuning training layer and a fully connected layer connected to the fine-tuning training layer, wherein the fully connected layer is a trained neural network recognition model, the training set of the deep neural network recognition model is consistent with the training set of the neural network recognition model and comprises training images corresponding to a plurality of angles, and each angle corresponds to at least one training image, wherein:
the fine-tuning training layer is configured to train the fully connected layer according to the original weight matrix of the fully connected layer and the training set of the deep neural network recognition model, and to obtain a target weight matrix when training is completed;
the fully connected layer is configured to extract pedestrian features from a target image according to the target weight matrix; acquire the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; rank the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determine the feature data of the pedestrian in the target image according to that preset data;
the fully connected layer is specifically configured to perform angle recognition on the target image to obtain an angle of the target image, and to perform feature recognition on the target image according to the target weight matrix and the angle to obtain the pedestrian features of the pedestrian object corresponding to the target image; and the fine-tuning training layer comprises a decorrelation training layer, a tension training layer connected to the decorrelation training layer, and a relaxation training layer connected to the tension training layer, wherein:
the decorrelation training layer is configured to perform singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and to take the product of the first matrix and the second matrix as a reference weight matrix, wherein the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer is configured to fix the reference weight matrix and train the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer is configured to train the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
2. The deep neural network recognition model of claim 1, further comprising a preprocessing layer connected to the fine-tuning training layer and configured to matte a target test image to obtain a pedestrian image, and to resize the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
3. The deep neural network recognition model of claim 1, wherein the fully connected layer is specifically configured to obtain the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
4. A recognition method for a deep neural network recognition model, wherein the method is based on the deep neural network recognition model of any one of claims 1 to 3, and the method comprises:
the fine-tuning training layer training the fully connected layer according to the original weight matrix of the fully connected layer and the training set of the deep neural network recognition model, and obtaining a target weight matrix when training is completed, wherein the fully connected layer is a trained neural network recognition model, the training set of the deep neural network recognition model is consistent with the training set of the neural network recognition model and comprises training images corresponding to a plurality of angles, and each angle corresponds to at least one training image; and
the fully connected layer extracting pedestrian features from the target image according to the target weight matrix; acquiring the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; ranking the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determining the feature data of the pedestrian in the target image according to that preset data;
wherein the fully connected layer extracting the pedestrian features of the target image according to the target weight matrix comprises:
the fully connected layer performing angle recognition on the target image to obtain the angle of the target image, and performing feature recognition on the target image according to the target weight matrix and the angle to obtain the pedestrian features of the pedestrian object corresponding to the target image; and wherein the fine-tuning training layer training the fully connected layer according to the original weight matrix of the fully connected layer to obtain a target weight matrix comprises:
the decorrelation training layer performing singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and taking the product of the first matrix and the second matrix as a reference weight matrix, wherein the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer fixing the reference weight matrix and training the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer training the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
5. The method of claim 4, wherein before the fully connected layer extracts the pedestrian features of the target image according to the target weight matrix, the method further comprises:
the preprocessing layer matting the target test image to obtain a pedestrian image, and resizing the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
6. The method of claim 4, wherein the fully connected layer acquiring the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities comprises:
the fully connected layer obtaining the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
CN201711209932.7A 2017-11-24 2017-11-24 Deep neural network recognition model and recognition method Active CN108108662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711209932.7A CN108108662B (en) 2017-11-24 2017-11-24 Deep neural network recognition model and recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711209932.7A CN108108662B (en) 2017-11-24 2017-11-24 Deep neural network recognition model and recognition method

Publications (2)

Publication Number Publication Date
CN108108662A CN108108662A (en) 2018-06-01
CN108108662B true CN108108662B (en) 2021-05-25

Family

ID=62207702

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711209932.7A Active CN108108662B (en) 2017-11-24 2017-11-24 Deep neural network recognition model and recognition method

Country Status (1)

Country Link
CN (1) CN108108662B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108769598A (en) * 2018-06-08 2018-11-06 复旦大学 Cross-camera video synopsis method based on pedestrian re-identification
CN110008997B (en) * 2019-03-06 2023-11-24 平安科技(深圳)有限公司 Image texture similarity recognition method, device and computer readable storage medium
CN110493598A (en) * 2019-08-12 2019-11-22 北京中科寒武纪科技有限公司 Video processing method and related apparatus
WO2021068142A1 (en) * 2019-10-09 2021-04-15 深圳大学 Training method and detection method for automatically identifying recaptured image of original document
CN111191675B (en) * 2019-12-03 2023-10-24 深圳市华尊科技股份有限公司 Pedestrian attribute identification model realization method and related device
CN111191601B (en) * 2019-12-31 2023-05-12 深圳云天励飞技术有限公司 Method, device, server and storage medium for identifying peer users
CN111191602B (en) * 2019-12-31 2023-06-13 深圳云天励飞技术有限公司 Pedestrian similarity acquisition method and device, terminal equipment and readable storage medium
CN111242217A (en) * 2020-01-13 2020-06-05 支付宝实验室(新加坡)有限公司 Training method and device of image recognition model, electronic equipment and storage medium
CN111860629A (en) * 2020-06-30 2020-10-30 北京滴普科技有限公司 Jewelry classification system, method, device and storage medium
CN112307227B (en) * 2020-11-24 2023-08-29 国家电网有限公司大数据中心 Data classification method
CN112560720A (en) * 2020-12-21 2021-03-26 奥比中光科技集团股份有限公司 Pedestrian identification method and system
CN113205451A (en) * 2021-03-30 2021-08-03 北京达佳互联信息技术有限公司 Image processing method, image processing device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105979210A (en) * 2016-06-06 2016-09-28 深圳市深网视界科技有限公司 Pedestrian identification system based on multi-ball multi-gun camera array
CN106503687A (en) * 2016-11-09 2017-03-15 合肥工业大学 The monitor video system for identifying figures of fusion face multi-angle feature and its method
CN106778464A (en) * 2016-11-09 2017-05-31 深圳市深网视界科技有限公司 Pedestrian re-identification method and device based on deep learning
CN107301380A (en) * 2017-06-01 2017-10-27 华南理工大学 Pedestrian re-identification method for video surveillance scenes

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9728184B2 (en) * 2013-06-18 2017-08-08 Microsoft Technology Licensing, Llc Restructuring deep neural network acoustic models
CN104850531A (en) * 2014-02-19 2015-08-19 日本电气株式会社 Method and device for establishing mathematical model
CN104239892A (en) * 2014-08-25 2014-12-24 西安电子科技大学 SAR image mixed model fitting method based on KSVD training dictionary
CN107341541B (en) * 2016-04-29 2021-01-29 中科寒武纪科技股份有限公司 Apparatus and method for performing full connectivity layer neural network training
CN106295584A (en) * 2016-08-16 2017-01-04 深圳云天励飞技术有限公司 Crowd attribute recognition method based on deep transfer learning
CN106355248A (en) * 2016-08-26 2017-01-25 深圳先进技术研究院 Deep convolution neural network training method and device
CN106803063B (en) * 2016-12-21 2019-06-28 华中科技大学 Metric learning method for pedestrian re-identification

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105979210A (en) * 2016-06-06 2016-09-28 深圳市深网视界科技有限公司 Pedestrian identification system based on multi-ball multi-gun camera array
CN106503687A (en) * 2016-11-09 2017-03-15 合肥工业大学 The monitor video system for identifying figures of fusion face multi-angle feature and its method
CN106778464A (en) * 2016-11-09 2017-05-31 深圳市深网视界科技有限公司 Pedestrian re-identification method and device based on deep learning
CN107301380A (en) * 2017-06-01 2017-10-27 华南理工大学 Pedestrian re-identification method for video surveillance scenes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SVDNet for Pedestrian Retrieval; Yifan S. et al.; Baidu Scholar; 2017-08-06; pp. 1-9 *

Also Published As

Publication number Publication date
CN108108662A (en) 2018-06-01

Similar Documents

Publication Publication Date Title
CN108108662B (en) Deep neural network recognition model and recognition method
US10402627B2 (en) Method and apparatus for determining identity identifier of face in face image, and terminal
CN108021933B (en) Neural network recognition device and recognition method
EP2676224B1 (en) Image quality assessment
CN110659665B (en) Model construction method of different-dimension characteristics and image recognition method and device
EP2907082B1 (en) Using a probabilistic model for detecting an object in visual data
CN107958230B (en) Facial expression recognition method and device
CN103942577A (en) Identity identification method based on self-established sample library and composite characters in video monitoring
CN105184260B (en) A kind of image characteristic extracting method and pedestrian detection method and device
CN110569731A (en) face recognition method and device and electronic equipment
CN109426831B (en) Image similarity matching and model training method and device and computer equipment
CN111401339B (en) Method and device for identifying age of person in face image and electronic equipment
CN110674685A (en) Human body analytic segmentation model and method based on edge information enhancement
KR101326691B1 (en) Robust face recognition method through statistical learning of local features
CN111814690A (en) Target re-identification method and device and computer readable storage medium
CN113449704A (en) Face recognition model training method and device, electronic equipment and storage medium
CN113255557A (en) Video crowd emotion analysis method and system based on deep learning
JP2014228995A (en) Image feature learning device, image feature learning method and program
CN113673465A (en) Image detection method, device, equipment and readable storage medium
CN106407942B (en) Image processing method and device
CN105224957B (en) A kind of method and system of the image recognition based on single sample
CN109598201B (en) Action detection method and device, electronic equipment and readable storage medium
CN111985434A (en) Model-enhanced face recognition method, device, equipment and storage medium
CN109165551B (en) Expression recognition method for adaptively weighting and fusing significance structure tensor and LBP characteristics
CN110781866A (en) Panda face image gender identification method and device based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant