CN108108662B - Deep neural network recognition model and recognition method - Google Patents

Deep neural network recognition model and recognition method

Info

Publication number
CN108108662B
CN108108662B (Application CN201711209932.7A)
Authority
CN
China
Prior art keywords
training
matrix
layer
weight matrix
pedestrian
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711209932.7A
Other languages
Chinese (zh)
Other versions
CN108108662A (en)
Inventor
张德雷
何其佳
曾儿孟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN HARZONE TECHNOLOGY CO LTD
Original Assignee
SHENZHEN HARZONE TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN HARZONE TECHNOLOGY CO LTD
Priority to CN201711209932.7A
Publication of CN108108662A
Application granted
Publication of CN108108662B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Abstract

Embodiments of the present application disclose a deep neural network recognition model and a recognition method. In the method, the fine-tuning training layer trains the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and obtains a target weight matrix when training is completed; the fully connected layer extracts pedestrian features from a target image according to the target weight matrix, acquires the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities, ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity, and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data. The embodiments of the present application can improve the recognition speed and accuracy of pedestrian re-identification.

Description

Deep neural network recognition model and recognition method
Technical Field
The application relates to the field of neural network algorithms, in particular to a deep neural network recognition model and a deep neural network recognition method.
Background
Pedestrian re-identification has great application potential in fields such as security, criminal investigation, and image retrieval; it mainly aims to determine whether the pedestrian in an image captured by a given camera is a target person. However, the accuracy of pedestrian re-identification methods in real monitoring environments is low. How to improve the accuracy of pedestrian re-identification is therefore a technical problem to be solved by those skilled in the art.
Disclosure of Invention
Embodiments of the present application provide a deep neural network recognition model and a recognition method, which can improve the recognition speed and accuracy of pedestrian re-identification.
In a first aspect, an embodiment of the present application provides a deep neural network recognition model, including a fine-tuning training layer and a fully connected layer connected to the fine-tuning training layer, where the fully connected layer is a trained neural network recognition model, and where:
the fine-tuning training layer is configured to train the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and to obtain a target weight matrix when training is completed;
the fully connected layer is configured to extract pedestrian features from a target image according to the target weight matrix; acquire the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; rank the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determine the feature data of the pedestrian in the target image according to that preset data.
With reference to the first aspect, in a first possible implementation of the first aspect, the fine-tuning training layer includes a decorrelation training layer, a tension training layer connected to the decorrelation training layer, and a relaxation training layer connected to the tension training layer, where:
the decorrelation training layer is configured to perform singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and to take the product of the first matrix and the second matrix as a reference weight matrix, where the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer is configured to fix the reference weight matrix and train the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer is configured to train the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
With reference to the first aspect or its first possible implementation, in a second possible implementation of the first aspect, the deep neural network recognition model further includes a preprocessing layer connected to the fine-tuning training layer and configured to matte a target test image to obtain a pedestrian image, and to resize the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
With reference to the first aspect or its first possible implementation, in a third possible implementation of the first aspect, the fully connected layer is specifically configured to obtain the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
With reference to the first aspect or its first possible implementation, in a fourth possible implementation of the first aspect, the training set includes training images corresponding to a plurality of angles, and each angle corresponds to at least one training image.
In a second aspect, an embodiment of the present application provides a recognition method based on the deep neural network recognition model of the first aspect, where:
the fine-tuning training layer trains the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and obtains a target weight matrix when training is completed; and
the fully connected layer extracts pedestrian features from the target image according to the target weight matrix; acquires the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data.
With reference to the second aspect, in a first possible implementation of the second aspect, the fine-tuning training layer training the fully connected layer according to the original weight matrix of the fully connected layer to obtain a target weight matrix includes:
the decorrelation training layer performing singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and taking the product of the first matrix and the second matrix as a reference weight matrix, where the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer fixing the reference weight matrix and training the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer training the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
With reference to the second aspect or its first possible implementation, in a second possible implementation of the second aspect, before the fully connected layer extracts the pedestrian features of the target image according to the target weight matrix, the method further includes:
the preprocessing layer matting the target test image to obtain a pedestrian image, and resizing the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
With reference to the second aspect or its first possible implementation, in a third possible implementation of the second aspect, the fully connected layer acquiring the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities includes:
the fully connected layer obtaining the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
With reference to the second aspect or its first possible implementation, in a fourth possible implementation of the second aspect, the training set includes training images corresponding to a plurality of angles, and each angle corresponds to at least one training image.
In a third aspect, the present embodiments provide a computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method of the second aspect.
With the deep neural network recognition model and recognition method described above, the fully connected layer is trained by the fine-tuning training layer according to the original weight matrix of the fully connected layer and the training set, and a target weight matrix is obtained when training is completed; the fully connected layer then extracts pedestrian features from the target image according to the target weight matrix, acquires the similarity between the pedestrian features and each item of preset data in the preset data set to obtain a plurality of similarities, ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity, and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data. That is, the weight matrix of the fully connected layer is optimized by the fine-tuning training layer, which improves the recognition efficiency and accuracy of the deep neural network recognition model, namely, the recognition speed and accuracy of pedestrian re-identification.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; other drawings can be obtained from them by those of ordinary skill in the art without creative effort.
Wherein:
FIG. 1 is a schematic structural diagram of a deep neural network recognition model provided in an embodiment of the present application;
Fig. 1A is a schematic structural diagram of a fine-tuning training layer provided in an embodiment of the present application;
Fig. 2 is a schematic flowchart of a recognition method of a deep neural network recognition model provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to a determination", or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Embodiments of the present application provide a deep neural network recognition model and a recognition method, where the deep neural network recognition model is built on a trained neural network recognition model. That is, the present application performs fine-tuning training on the neural network recognition model, which can improve the recognition speed and accuracy of pedestrian re-identification. The present application is described in further detail below with reference to specific embodiments and the accompanying drawings.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of a deep neural network recognition model provided in the present application. As shown in Fig. 1, the deep neural network recognition model 100 includes a fine-tuning training layer 102 and a fully connected layer 104, where the fine-tuning training layer 102 is the layer above the fully connected layer 104, i.e., the output of the fine-tuning training layer 102 is connected to the input of the fully connected layer 104, and the fully connected layer 104 is a trained neural network recognition model.
In the present application, the training set of the deep neural network recognition model 100 is the same as that of the underlying neural network recognition model: it includes training images corresponding to a plurality of angles, each angle corresponding to at least one training image. The training images may be images of the same person acquired by different cameras, or by the same camera at different angles. When the training set contains training images for a plurality of angles, training can target each angle separately, which improves the accuracy of pedestrian re-identification across angles and facilitates subsequent tracking and monitoring. The number of training images in the training set of the deep neural network recognition model 100 is not limited; the more images there are, the more training can be performed, and the higher the accuracy of the deep neural network recognition model 100.
Optionally, the fully connected layer 104 is configured to perform angle recognition on a training image to obtain the angle of the training image, and then perform feature recognition on the training image according to that angle to obtain the pedestrian features of the pedestrian object in the training image. That is, the angle of the training image is determined first, and feature recognition is then performed conditioned on that angle, which can improve the accuracy of attribute recognition.
For example, suppose the gender attribute of a pedestrian takes the values male and female, and the angle attribute is divided into 3 classes: front, side, and back. The network can first produce an output over the 6 combined classes (2 genders × 3 angles), and the final result for the 2 gender classes is then obtained from that output, improving the accuracy of pedestrian feature recognition; a sketch of this two-step readout follows.
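As a minimal illustrative sketch (not part of the patent text; the joint-class layout and the function name are assumptions), the combined 6-class output can be collapsed back to a gender decision by summing over the angle classes:

    import numpy as np

    # Hypothetical layout of the 6 joint classes: 2 genders x 3 angles.
    JOINT_CLASSES = [
        ("male", "front"), ("male", "side"), ("male", "back"),
        ("female", "front"), ("female", "side"), ("female", "back"),
    ]

    def gender_from_joint(probs):
        # Collapse a 6-way joint distribution over (gender, angle)
        # back to the 2 gender classes by marginalizing over angle.
        probs = np.asarray(probs, dtype=float)
        male = probs[:3].sum()     # sum over the 3 angle classes for "male"
        female = probs[3:].sum()   # sum over the 3 angle classes for "female"
        return "male" if male >= female else "female"

    # Example: an output most confident in ("female", "side").
    print(gender_from_joint([0.05, 0.10, 0.05, 0.15, 0.45, 0.20]))  # -> female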
In the present application, the convergence value of the training loss is set as a first threshold, and the limit on the number of training iterations is set as a second threshold.
It should be noted that the deep neural network recognition model 100 and the neural network recognition model involved in the present application are trained in the same way as other neural network models: one training period consists of a single forward operation and a single backward gradient propagation. That is, the forward operation is performed according to the connections in the deep neural network recognition model 100; when the loss between the output attribute produced by the deep neural network recognition model 100 and the expected output attribute is greater than the first threshold and the training count is less than or equal to the second threshold, backward gradient propagation is carried out according to the loss, i.e., the weights of each layer are corrected by loss-gradient descent. By repeating this forward propagation of information and backward propagation of the loss gradient, the deep neural network recognition model 100 is trained, the loss it outputs is reduced, and the recognition accuracy is improved.
Optionally, the fine-tuning training layer 102 is configured to calculate the loss as Loss = (y_p - y)^2, where y_p is the expected output attribute and y is the output attribute.
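As an illustrative sketch only (assuming PyTorch; the helper name and the averaging of the squared error over a batch are assumptions), the training schedule described above, i.e. a forward operation followed by backward gradient propagation while the loss exceeds the first threshold and the training count does not exceed the second threshold, can be written as:

    import torch

    def fine_tune(layer, optimizer, train_loader, loss_threshold, max_iters):
        # loss_threshold: the first threshold (convergence value of the loss)
        # max_iters: the second threshold (limit on the number of trainings)
        it = 0
        for x, y_expected in train_loader:
            y = layer(x)                           # forward operation
            loss = ((y_expected - y) ** 2).mean()  # Loss = (y_p - y)^2
            if loss.item() <= loss_threshold or it > max_iters:
                break                              # training is finished
            optimizer.zero_grad()
            loss.backward()                        # backward gradient propagation
            optimizer.step()                       # correct weights by loss-gradient descent
            it += 1
        return layer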
In the present application, the fine-tuning training layer 102 is configured to train the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and to obtain a target weight matrix when training is completed.
The weight matrix of the fully connected layer 104 determines the accuracy of the fully connected layer 104 and hence of the neural network recognition model; by training the fully connected layer 104, the accuracy of the deep neural network recognition model 100 can be improved over that of the underlying neural network recognition model.
Optionally, as shown in fig. 1A, the fine-tuning training layer 102 includes a decorrelation training layer 1021, a tension training layer 1022 connected to the decorrelation training layer 1021, and a relaxation training layer 1023 connected to the tension training layer 1022.
The decorrelation training layer 1021 is configured to perform Singular Value Decomposition (SVD) on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and use a product between the first matrix and the second matrix as a reference weight matrix.
Singular value decomposition is a data analysis method for finding the "patterns" implicit in a large amount of data; it can be used in pattern recognition, data compression, and the like to map a data set into a low-dimensional space. The computation is as follows: assuming the original weight matrix W is an n×m matrix, then
W = USV'
where U is an n×n orthonormal matrix, S is an n×m diagonal matrix, and V is an m×m orthonormal matrix; the reference weight matrix is then W' = US.
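A minimal sketch of this decorrelation step (assuming NumPy and the n×m shapes given above; the function name is an assumption):

    import numpy as np

    def decorrelate(W):
        # W: original n x m weight matrix of the fully connected layer.
        n, m = W.shape
        U, s, Vt = np.linalg.svd(W, full_matrices=True)  # W = U S V'
        S = np.zeros((n, m))
        k = min(n, m)
        S[:k, :k] = np.diag(s)  # embed the singular values in an n x m diagonal matrix
        return U @ S            # reference weight matrix W' = US

Note that since V is orthonormal, W' = US = WV, so the decorrelation amounts to rotating W by its right singular vectors.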
The tension training layer 1022 is configured to fix the reference weight matrix and train the fully connected layer 104 according to the training set to obtain a suboptimal weight matrix.
After the fine-tuning training layer 102 is added, the weights of the fully connected layer 104 change. Each time the tension training layer 1022 trains, a plurality of weight models is obtained; these weight models are ranked, and the optimal one is selected for the subsequent relaxation training.
The relaxation training layer 1023 is configured to train the fully connected layer 104 according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
The fixing of the suboptimal weight matrix is cancelled, and the fully connected layer 104 is trained according to the suboptimal weight matrix and the training set. Each training pass of the fully connected layer 104 again produces a plurality of weight models; these are ranked and the optimal one is selected for the next round of relaxation training. Training is completed when the fully connected layer 104 converges, yielding the target weight matrix.
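The tension and relaxation phases can be sketched as follows (assumptions: PyTorch; that "training with the reference weight matrix fixed" means updating the remaining trainable parameters while the layer's weight is frozen, in the spirit of the restraint-and-relaxation scheme of the cited SVDNet paper; and that train_to_convergence is a hypothetical helper running a training loop such as the one above):

    import torch

    def tension_then_relax(model, fc, W_ref, train_to_convergence):
        with torch.no_grad():
            fc.weight.copy_(torch.as_tensor(W_ref, dtype=fc.weight.dtype))
        fc.weight.requires_grad_(False)  # tension: the reference weight matrix is fixed
        train_to_convergence(model)      # yields the suboptimal weight matrix
        fc.weight.requires_grad_(True)   # relaxation: the fixing is cancelled
        train_to_convergence(model)      # yields the target weight matrix
        return model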
In the present application, the fully connected layer 104 is configured to extract pedestrian features from the target image according to the target weight matrix; acquire the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; rank the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determine the feature data of the pedestrian in the target image according to that preset data.
The preset threshold is the minimum similarity required to identify the pedestrian in the image; that is, when the maximum similarity exceeds the preset threshold, the corresponding preset data can be taken as the feature data of the pedestrian in the target image.
The way the similarity is calculated is not limited in the present application; it may be the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance.
It can be understood that the fully connected layer 104 performs both feature extraction and feature recognition in the deep neural network recognition model 100: it obtains the similarity between the pedestrian features in the target image and each item of preset data in the data set, and determines the feature data of the pedestrian in the target image according to whether the maximum similarity exceeds the preset threshold, thereby completing the pedestrian re-identification process.
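A sketch of this matching stage (assuming NumPy and cosine similarity; the Euclidean, Mahalanobis, or Hamming distance could be substituted, and all names below are assumptions):

    import numpy as np

    def re_identify(pedestrian_feature, preset_data_set, preset_threshold):
        # preset_data_set: list of (identity, feature_vector) pairs.
        q = pedestrian_feature / np.linalg.norm(pedestrian_feature)
        sims = [(pid, float(q @ (f / np.linalg.norm(f))))
                for pid, f in preset_data_set]
        sims.sort(key=lambda t: t[1], reverse=True)  # rank the similarities
        best_id, best_sim = sims[0]                  # maximum similarity
        # Only accept the match when it clears the preset threshold.
        return best_id if best_sim > preset_threshold else None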
Optionally, the deep neural network recognition model 100 further includes a preprocessing layer connected to the fine-tuning training layer 102, configured to matte a target test image to obtain a pedestrian image, and to resize the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
The target test image is any training image, or an image acquired by any camera. It can be understood that after the pedestrian image is extracted from the target test image, the parts irrelevant to the pedestrian are discarded, which improves the accuracy of pedestrian identification.
Since every neural network model has a required base input size, i.e., it can only process images of that size, the pedestrian image is resized so that the resulting target image matches the base input size of the deep neural network recognition model, which is the same as the base input size of the underlying neural network recognition model.
Optionally, when the image size of the pedestrian image is smaller than the base input size, the pedestrian image is padded up to the base input size using a preset image; when the image size of the pedestrian image is larger than the base input size of the deep neural network recognition model, the pedestrian image is scaled down to the base input size.
The preset image may be a solid-colour image, a lower-resolution image, or the like; details are not repeated here.
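A sketch of this size processing (assuming Pillow; the solid-colour fill stands in for the preset image, and the centring of the paste is an assumption, since the patent does not fix where the padding goes):

    from PIL import Image

    def to_input_size(pedestrian_img, base_size, fill=(0, 0, 0)):
        w, h = pedestrian_img.size
        bw, bh = base_size
        if w <= bw and h <= bh:
            # Smaller than the base input size: pad with the preset fill.
            canvas = Image.new("RGB", base_size, fill)
            canvas.paste(pedestrian_img, ((bw - w) // 2, (bh - h) // 2))
            return canvas
        # Larger than the base input size: scale down to it.
        return pedestrian_img.resize(base_size)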
In the deep neural network recognition model shown in Fig. 1, the fully connected layer is trained by the fine-tuning training layer 102 according to the original weight matrix of the fully connected layer and the training set, and a target weight matrix is obtained when training is completed; the fully connected layer 104 then extracts pedestrian features from the target image according to the target weight matrix, acquires the similarity between the pedestrian features and each item of preset data in the preset data set to obtain a plurality of similarities, ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity, and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data. That is, the weight matrix of the fully connected layer 104 is optimized by the fine-tuning training layer 102, which improves the recognition efficiency and accuracy of the deep neural network recognition model, namely, the recognition speed and accuracy of pedestrian re-identification.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a recognition method of a deep neural network recognition model provided in an embodiment of the present application. As shown in Fig. 2, the method is applied to the deep neural network recognition model shown in Fig. 1, where:
201: the fine-tuning training layer trains the fully connected layer according to the original weight matrix of the fully connected layer and the training set, and obtains a target weight matrix when training is completed.
202: the fully connected layer extracts pedestrian features from the target image according to the target weight matrix; acquires the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data.
Optionally, the fine-tuning training layer training the fully connected layer according to the original weight matrix of the fully connected layer to obtain a target weight matrix includes:
the decorrelation training layer performing singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and taking the product of the first matrix and the second matrix as a reference weight matrix, where the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer fixing the reference weight matrix and training the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer training the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
Optionally, before the fully connected layer extracts the pedestrian features of the target image according to the target weight matrix, the method further includes:
the preprocessing layer matting the target test image to obtain a pedestrian image, and resizing the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
Optionally, the fully connected layer acquiring the similarity between the pedestrian features and each item of preset data in the preset data set to obtain a plurality of similarities includes:
the fully connected layer obtaining the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
Optionally, the training set includes training images corresponding to a plurality of angles, and each angle corresponds to at least one training image.
In the recognition method of the deep neural network recognition model shown in Fig. 2, the fully connected layer is trained by the fine-tuning training layer according to the original weight matrix of the fully connected layer and the training set, and a target weight matrix is obtained when training is completed; the fully connected layer then extracts pedestrian features from the target image according to the target weight matrix, acquires the similarity between the pedestrian features and each item of preset data in the preset data set to obtain a plurality of similarities, ranks the plurality of similarities to obtain the preset data corresponding to the maximum similarity, and, when the maximum similarity is greater than a preset threshold, determines the feature data of the pedestrian in the target image according to that preset data. That is, the weight matrix of the fully connected layer is optimized by the fine-tuning training layer, which improves the recognition efficiency and accuracy of the deep neural network recognition model, namely, the recognition speed and accuracy of pedestrian re-identification.
In another embodiment of the present invention, a computer-readable storage medium is provided that stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the recognition method of the deep neural network recognition model described above.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both; the components and steps of the examples have been described above in general functional terms to illustrate clearly the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints of the implementation. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the terminal and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal and method can be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is only one type of division of logical functions, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, the technical solution of the present invention essentially or partially contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the above method according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be noted that implementations not shown or described in the drawings or in the description are all forms known to a person of ordinary skill in the art and are not described in detail. Further, the above definitions of the various elements and methods are not limited to the specific structures, shapes, or arrangements mentioned in the embodiments, which may be easily modified or substituted by those of ordinary skill in the art.
The above-mentioned embodiments are further described in detail for the purpose of illustrating the invention, and it should be understood that the above-mentioned embodiments are only illustrative of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A deep neural network recognition model, characterized by comprising a fine-tuning training layer and a fully connected layer connected to the fine-tuning training layer, wherein the fully connected layer is a trained neural network recognition model, the training set of the deep neural network recognition model is consistent with the training set of the neural network recognition model and comprises training images corresponding to a plurality of angles, and each angle corresponds to at least one training image, wherein:
the fine-tuning training layer is configured to train the fully connected layer according to the original weight matrix of the fully connected layer and the training set of the deep neural network recognition model, and to obtain a target weight matrix when training is completed;
the fully connected layer is configured to extract pedestrian features from a target image according to the target weight matrix; acquire the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; rank the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determine the feature data of the pedestrian in the target image according to that preset data;
the fully connected layer is specifically configured to perform angle recognition on the target image to obtain an angle of the target image, and to perform feature recognition on the target image according to the target weight matrix and the angle to obtain the pedestrian features of the pedestrian object corresponding to the target image; and the fine-tuning training layer comprises a decorrelation training layer, a tension training layer connected to the decorrelation training layer, and a relaxation training layer connected to the tension training layer, wherein:
the decorrelation training layer is configured to perform singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and to take the product of the first matrix and the second matrix as a reference weight matrix, wherein the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer is configured to fix the reference weight matrix and train the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer is configured to train the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
2. The deep neural network recognition model of claim 1, further comprising a preprocessing layer connected to the fine-tuning training layer and configured to matte a target test image to obtain a pedestrian image, and to resize the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
3. The deep neural network recognition model of claim 1, wherein the fully connected layer is specifically configured to obtain the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
4. A recognition method for a deep neural network recognition model, wherein the method is based on the deep neural network recognition model of any one of claims 1 to 3, and the method comprises:
the fine-tuning training layer training the fully connected layer according to the original weight matrix of the fully connected layer and the training set of the deep neural network recognition model, and obtaining a target weight matrix when training is completed, wherein the fully connected layer is a trained neural network recognition model, the training set of the deep neural network recognition model is consistent with the training set of the neural network recognition model and comprises training images corresponding to a plurality of angles, and each angle corresponds to at least one training image; and
the fully connected layer extracting pedestrian features from the target image according to the target weight matrix; acquiring the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities; ranking the plurality of similarities to obtain the preset data corresponding to the maximum similarity; and, when the maximum similarity is greater than a preset threshold, determining the feature data of the pedestrian in the target image according to that preset data;
wherein the fully connected layer extracting the pedestrian features of the target image according to the target weight matrix comprises:
the fully connected layer performing angle recognition on the target image to obtain the angle of the target image, and performing feature recognition on the target image according to the target weight matrix and the angle to obtain the pedestrian features of the pedestrian object corresponding to the target image; and wherein the fine-tuning training layer training the fully connected layer according to the original weight matrix of the fully connected layer to obtain a target weight matrix comprises:
the decorrelation training layer performing singular value decomposition on the original weight matrix to obtain a first matrix, a second matrix, and a third matrix, and taking the product of the first matrix and the second matrix as a reference weight matrix, wherein the original weight matrix W is an n×m orthogonal matrix, the first matrix U is an n×n orthonormal matrix, the second matrix S is an n×m diagonal matrix, and the third matrix V is an m×m orthonormal matrix;
the tension training layer fixing the reference weight matrix and training the fully connected layer according to the training set to obtain a suboptimal weight matrix; and
the relaxation training layer training the fully connected layer according to the suboptimal weight matrix and the training set to obtain the target weight matrix.
5. The method of claim 4, wherein before the fully connected layer extracts the pedestrian features of the target image according to the target weight matrix, the method further comprises:
the preprocessing layer matting the target test image to obtain a pedestrian image, and resizing the pedestrian image to obtain the target image, so that the image size of the target image matches the base input size of the deep neural network recognition model.
6. The method of claim 4, wherein the fully connected layer acquiring the similarity between the pedestrian features and each item of preset data in a preset data set to obtain a plurality of similarities comprises:
the fully connected layer obtaining the similarity between the pedestrian features and each item of preset data in the preset data set using at least one of the Euclidean distance, the Mahalanobis distance, the cosine distance, or the Hamming distance, so as to obtain the plurality of similarities.
CN201711209932.7A 2017-11-24 2017-11-24 Deep neural network recognition model and recognition method Active CN108108662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711209932.7A CN108108662B (en) 2017-11-24 2017-11-24 Deep neural network recognition model and recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711209932.7A CN108108662B (en) 2017-11-24 2017-11-24 Deep neural network recognition model and recognition method

Publications (2)

Publication Number Publication Date
CN108108662A CN108108662A (en) 2018-06-01
CN108108662B true CN108108662B (en) 2021-05-25

Family

ID=62207702

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711209932.7A Active CN108108662B (en) 2017-11-24 2017-11-24 Deep neural network recognition model and recognition method

Country Status (1)

Country Link
CN (1) CN108108662B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108769598A (en) * 2018-06-08 2018-11-06 复旦大学 Cross-camera video synopsis method based on pedestrian re-identification
CN110008997B (en) * 2019-03-06 2023-11-24 平安科技(深圳)有限公司 Image texture similarity recognition method, device and computer readable storage medium
CN110493598A (en) * 2019-08-12 2019-11-22 北京中科寒武纪科技有限公司 Video processing method and related apparatus
WO2021068142A1 (en) * 2019-10-09 2021-04-15 深圳大学 Training method and detection method for automatically identifying recaptured image of original document
CN111191675B (en) * 2019-12-03 2023-10-24 深圳市华尊科技股份有限公司 Pedestrian attribute identification model realization method and related device
CN111191601B (en) * 2019-12-31 2023-05-12 深圳云天励飞技术有限公司 Method, device, server and storage medium for identifying peer users
CN111191602B (en) * 2019-12-31 2023-06-13 深圳云天励飞技术有限公司 Pedestrian similarity acquisition method and device, terminal equipment and readable storage medium
CN111242217A (en) * 2020-01-13 2020-06-05 支付宝实验室(新加坡)有限公司 Training method and device of image recognition model, electronic equipment and storage medium
CN111860629A (en) * 2020-06-30 2020-10-30 北京滴普科技有限公司 Jewelry classification system, method, device and storage medium
CN112307227B (en) * 2020-11-24 2023-08-29 国家电网有限公司大数据中心 Data classification method
CN112560720A (en) * 2020-12-21 2021-03-26 奥比中光科技集团股份有限公司 Pedestrian identification method and system
CN113205451A (en) * 2021-03-30 2021-08-03 北京达佳互联信息技术有限公司 Image processing method, image processing device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105979210A (en) * 2016-06-06 2016-09-28 深圳市深网视界科技有限公司 Pedestrian identification system based on multi-ball multi-gun camera array
CN106503687A (en) * 2016-11-09 2017-03-15 合肥工业大学 The monitor video system for identifying figures of fusion face multi-angle feature and its method
CN106778464A (en) * 2016-11-09 2017-05-31 深圳市深网视界科技有限公司 Pedestrian re-identification method and device based on deep learning
CN107301380A (en) * 2017-06-01 2017-10-27 华南理工大学 Pedestrian re-identification method for video surveillance scenes

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9728184B2 (en) * 2013-06-18 2017-08-08 Microsoft Technology Licensing, Llc Restructuring deep neural network acoustic models
CN104850531A (en) * 2014-02-19 2015-08-19 日本电气株式会社 Method and device for establishing mathematical model
CN104239892A (en) * 2014-08-25 2014-12-24 西安电子科技大学 SAR image mixed model fitting method based on KSVD training dictionary
CN107341541B (en) * 2016-04-29 2021-01-29 中科寒武纪科技股份有限公司 Apparatus and method for performing full connectivity layer neural network training
CN106295584A (en) * 2016-08-16 2017-01-04 深圳云天励飞技术有限公司 Crowd attribute recognition method based on deep transfer learning
CN106355248A (en) * 2016-08-26 2017-01-25 深圳先进技术研究院 Deep convolution neural network training method and device
CN106803063B (en) * 2016-12-21 2019-06-28 华中科技大学 Metric learning method for pedestrian re-identification

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105979210A (en) * 2016-06-06 2016-09-28 深圳市深网视界科技有限公司 Pedestrian identification system based on multi-ball multi-gun camera array
CN106503687A (en) * 2016-11-09 2017-03-15 合肥工业大学 The monitor video system for identifying figures of fusion face multi-angle feature and its method
CN106778464A (en) * 2016-11-09 2017-05-31 深圳市深网视界科技有限公司 Pedestrian re-identification method and device based on deep learning
CN107301380A (en) * 2017-06-01 2017-10-27 华南理工大学 Pedestrian re-identification method for video surveillance scenes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SVDNet for Pedestrian Retrieval; Yifan S. et al.; Baidu Scholar; 2017-08-06; pp. 1-9 *

Also Published As

Publication number Publication date
CN108108662A (en) 2018-06-01

Similar Documents

Publication Publication Date Title
CN108108662B (en) Deep neural network recognition model and recognition method
US10402627B2 (en) Method and apparatus for determining identity identifier of face in face image, and terminal
CN108021933B (en) Neural network recognition device and recognition method
EP2676224B1 (en) Image quality assessment
CN110659665B (en) Model construction method of different-dimension characteristics and image recognition method and device
EP2907082B1 (en) Using a probabilistic model for detecting an object in visual data
CN107958230B (en) Facial expression recognition method and device
CN103942577A (en) Identity identification method based on self-established sample library and composite characters in video monitoring
CN105184260B (en) A kind of image characteristic extracting method and pedestrian detection method and device
CN110569731A (en) face recognition method and device and electronic equipment
CN109426831B (en) Image similarity matching and model training method and device and computer equipment
CN111401339B (en) Method and device for identifying age of person in face image and electronic equipment
CN110674685A (en) Human body analytic segmentation model and method based on edge information enhancement
KR101326691B1 (en) Robust face recognition method through statistical learning of local features
CN111814690A (en) Target re-identification method and device and computer readable storage medium
CN113449704A (en) Face recognition model training method and device, electronic equipment and storage medium
CN113255557A (en) Video crowd emotion analysis method and system based on deep learning
JP2014228995A (en) Image feature learning device, image feature learning method and program
CN113673465A (en) Image detection method, device, equipment and readable storage medium
CN106407942B (en) Image processing method and device
CN105224957B (en) A kind of method and system of the image recognition based on single sample
CN109598201B (en) Action detection method and device, electronic equipment and readable storage medium
CN111985434A (en) Model-enhanced face recognition method, device, equipment and storage medium
CN109165551B (en) Expression recognition method for adaptively weighting and fusing significance structure tensor and LBP characteristics
CN110781866A (en) Panda face image gender identification method and device based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant