CN111062477B - Data processing method, device and storage medium - Google Patents

Data processing method, device and storage medium

Info

Publication number
CN111062477B
Authority
CN
China
Prior art keywords
channel
neural network
network model
conditional entropy
channels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911303249.9A
Other languages
Chinese (zh)
Other versions
CN111062477A (en)
Inventor
高雨婷
胡易
余宗桥
孙星
彭湃
郭晓威
黄小明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Cloud Computing Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Beijing Co Ltd filed Critical Tencent Cloud Computing Beijing Co Ltd
Priority to CN201911303249.9A priority Critical patent/CN111062477B/en
Publication of CN111062477A publication Critical patent/CN111062477A/en
Application granted granted Critical
Publication of CN111062477B publication Critical patent/CN111062477B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Abstract

The application discloses a data processing method, apparatus and storage medium, which relate to the field of neural networks and are used to reduce the number of model parameters and the amount of model computation. In the method, the relative information between channels is determined by calculating the conditional entropy between the feature maps corresponding to the channels of an initial neural network model, and the initial neural network model is pruned by removing channels that have little influence on the other channels. Pruning the initial neural network model in this way reduces its number of parameters and its amount of computation, so that the speed of model operation is improved and the requirements on deployment devices are reduced while the model's effectiveness is maintained.

Description

Data processing method, device and storage medium
Technical Field
The present application relates to the field of neural networks, and in particular, to a data processing method, apparatus, and storage medium.
Background
With the development of deep learning, neural networks have become deeper and deeper, and their computational cost and number of parameters keep growing, which makes them difficult to deploy on devices with limited computing power. Many studies have demonstrated that these massive parameters contain a large amount of redundancy, yet a large number of parameters is necessary during the model training and optimization stage, because it makes the optimization problem easier and allows the model to converge to a better solution.
Therefore, once model training has converged, how to prune the unimportant channels from the model is of great significance for reducing the number of model parameters and the amount of model computation.
Disclosure of Invention
The embodiments of the application provide a data processing method, apparatus and storage medium, which are used to reduce the number of model parameters and the amount of model computation, thereby improving the speed of model operation and lowering the requirements on deployment devices while maintaining the model's effectiveness.
In a first aspect, a data processing method is provided, including:
acquiring training samples of data to be processed, inputting the training samples into a trained initial neural network model, and respectively acquiring feature graphs output by each convolution layer in the initial neural network model;
for each convolution layer of the initial neural network model, calculating conditional entropy between a feature map corresponding to each channel in the convolution layer and feature maps corresponding to other channels;
for each convolution layer of the initial neural network model, averaging conditional entropy between the feature images corresponding to each channel in the convolution layer and the feature images corresponding to other channels to obtain average conditional entropy between the feature images corresponding to each channel and the feature images corresponding to other channels;
Aiming at each convolution layer of the initial neural network model, pruning the initial neural network model according to average conditional entropy corresponding to each channel in the convolution layer to obtain an optimized neural network model;
and inputting the data to be processed into the optimized neural network model to obtain processed data output by the optimized neural network model.
In a second aspect, there is provided a data processing apparatus comprising:
the characteristic diagram acquisition module is used for acquiring training samples of data to be processed, inputting the training samples into a trained initial neural network model, and respectively acquiring characteristic diagrams output by all convolution layers in the initial neural network model;
the first conditional entropy acquisition module is used for calculating the conditional entropy between the feature map corresponding to each channel in each convolution layer and the feature maps corresponding to other channels aiming at each convolution layer of the initial neural network model;
the average conditional entropy obtaining module is used for averaging the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to other channels in each convolution layer of the initial neural network model to obtain average conditional entropy between the feature map corresponding to each channel and the feature map corresponding to other channels;
The pruning module is used for carrying out pruning treatment on the initial neural network model according to the average conditional entropy corresponding to each channel in each convolution layer aiming at each convolution layer of the initial neural network model to obtain an optimized neural network model;
the data acquisition module is used for inputting the data to be processed into the optimized neural network model to obtain the processed data output by the optimized neural network model.
In one embodiment, the pruning module is specifically configured to prune a preset number of channels in ascending order of the average conditional entropy corresponding to each channel.
In one embodiment, the conditional entropy determining unit comprises:
the first determining probability sum subunit is used for summing the probabilities of the first channel in each interval to obtain the probability sum of the first channel; and
a second determining probability sum subunit, configured to sum products of conditional probabilities of a second channel in each interval relative to the first channel in each interval and logarithms of the conditional probabilities to obtain a probability sum of the second channel;
and a conditional entropy determination subunit, configured to take a negative number of a product of the probability sum of the first channel and the probability sum of the second channel as conditional entropy of the second channel on the first channel.
In one embodiment, the apparatus further comprises:
the second conditional entropy acquisition module is used for acquiring, for each convolution layer of the initial neural network model, the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to that same channel, after the first conditional entropy acquisition module calculates the conditional entropy between the feature map corresponding to each channel in the convolution layer and the feature maps corresponding to the other channels;
the conditional entropy matrix generating module is used for generating a conditional entropy matrix according to the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to the other channels and the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to that same channel; the numbers of rows and columns of the conditional entropy matrix both equal the number of channels of the convolution layer, and the conditional entropy in each row of the conditional entropy matrix is the conditional entropy between the feature map corresponding to the same channel and the feature maps corresponding to the other channels.
In one embodiment, the apparatus further comprises:
the training module is used for training the optimized neural network model after the pruning module has, for each convolution layer of the initial neural network model, pruned the initial neural network model according to the average conditional entropy corresponding to each channel in the convolution layer to obtain the optimized neural network model.
In a third aspect, a computing device is provided, comprising at least one processing unit, and at least one storage unit, wherein the storage unit stores a computer program, which when executed by the processing unit, causes the processing unit to perform the steps of any of the data processing methods described above.
In one embodiment, the computing device may be a server or a terminal device.
In a fourth aspect, a computer readable medium is provided, storing a computer program executable by a terminal device, which when run on the terminal device causes the terminal device to perform the steps of any one of the data processing methods described above.
According to the data processing method, apparatus and storage medium, the relative information between channels is determined by calculating the conditional entropy between the feature maps corresponding to the channels of the initial neural network model, and the initial neural network model is pruned by removing channels that have little influence on the other channels. Pruning the initial neural network model in this way reduces its number of parameters and its amount of computation, so that the speed of model operation is improved and the requirements on deployment devices are reduced while the model's effectiveness is maintained.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart of a data processing method in an embodiment of the application;
FIG. 2 is a flow chart of a data processing method according to an embodiment of the application;
FIG. 3 is a flow chart of a complete method of data processing according to an embodiment of the application;
FIG. 4 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of a terminal device in an embodiment of the present application.
Detailed Description
In order to reduce the number of model parameters and the calculation amount of the model, thereby improving the speed of model operation and reducing the requirement on deployment equipment while ensuring the effect, the embodiment of the application provides a data processing method, a data processing device and a storage medium. In order to better understand the technical scheme provided by the embodiment of the application, the basic principle of the scheme is briefly described as follows:
In order to facilitate a better understanding of the technical solutions in the embodiments of the present application, the following describes the technical terms related to the embodiments of the present application.
Feature map: refers to the image features generated by convolving a training sample with a convolution kernel in the neural network model. In the embodiment of the application, one training sample and one channel of a convolution layer yield one feature map.
Conditional entropy: represents the uncertainty of a random variable Y given that the random variable X is known. In the embodiment of the application, conditional entropy describes the influence of one channel on the other channels.
Model pruning: evaluating which parts of the model are unimportant and cutting them off, which reduces the number of model parameters and the amount of computation. Pruning a channel means deleting that channel.
The following briefly describes the design concept of the embodiment of the present application.
As described above, how to prune the unimportant channels from a model after training has converged is very important for reducing the number of model parameters and the amount of model computation. In the prior art, model pruning mostly considers the importance of each channel separately, usually by evaluating the importance of each channel in the model and pruning the unimportant channels accordingly, but it ignores the correlation between different channels. To address this, the embodiments of the application provide a data processing method, apparatus and storage medium. In the method, the relative information between channels is determined by calculating the conditional entropy between the feature maps corresponding to the channels of the initial neural network model, and the initial neural network model is pruned by removing channels that have little influence on the other channels. Pruning the initial neural network model in this way reduces its number of parameters and its amount of computation, so that the speed of model operation is improved and the requirements on deployment devices are reduced while the model's effectiveness is maintained.
With the data processing method, apparatus and storage medium, the speed of model operation is improved and the requirements on deployment devices are lowered while the model's effectiveness is maintained. For example, for pedestrian re-identification, the method provided by the embodiments of the application can be used to prune the recognition neural network model so that the pruned model can be deployed on a camera; the recognition work is then completed directly on the camera, which improves recognition efficiency.
For easy understanding, the technical scheme provided by the application is further described below with reference to the accompanying drawings.
Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behaviour to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
In an embodiment of the present application, the flow of the data processing method is shown in fig. 1 and consists of three parts: model pre-training, channel pruning based on conditional entropy, and training the pruned model.
Model pre-training means training a neural network model; it is the preparatory part of the application, and the trained multi-channel recognition neural network model serves as the initial neural network model of the application.
Channel pruning based on conditional entropy is the key part introduced by the present application, and it is described in detail below.
In an embodiment of the present application, in order to optimize the initial neural network model, as shown in fig. 2, the method specifically includes the following steps:
step 201: and obtaining training samples of the data to be processed, inputting the training samples into the trained initial neural network model, and respectively obtaining feature graphs output by each convolution layer in the initial neural network model.
In an embodiment of the application, the initial neural network model has at least one convolution layer, and each convolution layer has a plurality of channels therein. A training sample and a channel may result in a feature map. For example: the initial neural network model has 3 convolutional layers, each convolutional layer has 10 channels, 5 training samples are input into the initial neural network model, and 150 feature maps can be obtained in total.
It should be noted that the number of channels of each convolution layer may be the same or different, which is not limited in the present application.
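For illustration only, the following is a minimal sketch of how the per-layer, per-channel feature maps of this step could be collected, assuming the initial neural network model is a PyTorch model; the function name and the use of forward hooks are illustrative choices, not part of the application.

```python
import torch
import torch.nn as nn

def collect_feature_maps(model: nn.Module, samples: torch.Tensor):
    """Run `samples` through `model` and return {layer_name: tensor of shape
    (num_samples, num_channels, H, W)} for every convolution layer."""
    feature_maps, hooks = {}, []

    def make_hook(name):
        def hook(_module, _inputs, output):
            feature_maps[name] = output.detach()   # keep the layer's output feature maps
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            hooks.append(module.register_forward_hook(make_hook(name)))

    with torch.no_grad():
        model(samples)                             # one forward pass fills the dictionary

    for h in hooks:
        h.remove()
    return feature_maps
```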
Step 202: and calculating the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels in each convolution layer of the initial neural network model.
In the embodiment of the application, the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels is determined by determining the two norms (L2 Norm) of each feature map. The method can be concretely implemented as steps A1-A3:
step A1: and determining a characteristic diagram corresponding to each channel in each convolution layer aiming at each convolution layer of the initial neural network model, and calculating the two norms of the characteristic diagram corresponding to each channel.
In the embodiment of the present application, the two-norm is calculated by the following formula:
||x||₂ = (∑_{i=1}^{k} x_i²)^(1/2), where i = 1, 2, 3, …, k;  (1)
where ||x||₂ denotes the two-norm and x_i denotes the i-th element of the vector formed by the feature map. Calculating the two-norm of a feature map therefore achieves dimensionality reduction and reduces the amount of subsequent computation.
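For illustration only, the following is a minimal sketch of formula (1) applied to the feature maps of one convolution layer, assuming they have been collected into a NumPy array of shape (number of training samples, number of channels, height, width); the function name is hypothetical.

```python
import numpy as np

def feature_map_two_norms(layer_maps: np.ndarray) -> np.ndarray:
    """Return an array of shape (num_samples, num_channels) in which each entry
    is the two-norm of the corresponding feature map, i.e. sqrt(sum_i x_i^2)."""
    n, c = layer_maps.shape[:2]
    flat = layer_maps.reshape(n, c, -1)           # flatten each feature map to a vector
    return np.sqrt((flat ** 2).sum(axis=-1))      # one two-norm per (sample, channel)
```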
Step A2: dividing the two norms of the feature map corresponding to each channel into a preset number of sections, and determining the probability of each channel in each section according to the number of the two norms of the feature map corresponding to each channel in each section.
In the embodiment of the application, after the two-norm of each feature map is obtained, the two-norm values are grouped by the channel to which their feature maps belong, and the two-norm values of each channel are divided into intervals according to their magnitude, so that the probability of each channel in each interval can be determined.
In one embodiment, if a convolution layer has 2 channels, the layer produces 40 feature maps in total, i.e. 20 feature maps per channel. The two-norms of the 40 feature maps are calculated, and the intervals are divided according to the two-norm values of each channel.
If the maximum of the two-norm values corresponding to the first channel is 4.0 and the minimum is 1.0, the range is divided using 1.0 and 4.0 as boundaries; if it is divided into 3 intervals, the intervals are [1.0, 2.0), [2.0, 3.0) and [3.0, 4.0]. The probability of the first channel in each of the 3 intervals is determined from the number of the first channel's two-norm values falling in that interval. For example: the first interval [1.0, 2.0) contains 5 two-norm values, so its probability is 0.25; the second interval [2.0, 3.0) contains 5 two-norm values, so its probability is 0.25; the third interval [3.0, 4.0] contains 10 two-norm values, so its probability is 0.5.
If the maximum of the two-norm values corresponding to the second channel is 6.5 and the minimum is 0.5, the range is divided using 0.5 and 6.5 as boundaries; if it is divided into 3 intervals, the intervals are [0.5, 2.5), [2.5, 4.5) and [4.5, 6.5]. The probability of the second channel in each of the 3 intervals is determined from the number of the second channel's two-norm values falling in that interval. For example: the first interval [0.5, 2.5) contains 8 two-norm values, so its probability is 0.4; the second interval [2.5, 4.5) contains 2 two-norm values, so its probability is 0.1; the third interval [4.5, 6.5] contains 10 two-norm values, so its probability is 0.5.
This yields the probability of each channel in each interval. It should be noted that the number of intervals must be the same for every channel.
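The following is a minimal sketch of step A2 under the same assumptions: each channel's two-norm values are split into a preset number of equal-width intervals between that channel's minimum and maximum values, and the probability of each interval is the fraction of the channel's two-norm values falling in it. The helper name and the use of NumPy's histogram are illustrative choices.

```python
import numpy as np

def channel_probabilities(two_norms: np.ndarray, num_bins: int = 3):
    """`two_norms` has shape (num_samples, num_channels). For each channel, split
    [min, max] of its two-norm values into `num_bins` equal-width intervals and
    return (probabilities, interval_edges); probabilities has shape
    (num_channels, num_bins)."""
    num_samples, num_channels = two_norms.shape
    probs = np.zeros((num_channels, num_bins))
    edges = np.zeros((num_channels, num_bins + 1))
    for c in range(num_channels):
        counts, bin_edges = np.histogram(two_norms[:, c], bins=num_bins)
        probs[c] = counts / num_samples           # e.g. 5 of 20 values -> 0.25, as above
        edges[c] = bin_edges
    return probs, edges
```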
Step A3: and determining the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels according to the probability of each channel in each interval and the conditional entropy formula.
In the embodiment of the application, the conditional entropy is determined according to the probability of each channel and the conditional entropy formula. The method can be concretely implemented as steps B1-B3:
step B1: and summing the probabilities of the first channel in each interval to obtain the probability sum of the first channel.
Step B2: and summing the product of the conditional probability of the second channel in each section relative to the probability of the first channel in each section and the logarithm of the conditional probability to obtain the probability sum of the second channel.
Step B3: and taking the negative number of the product of the probability sum of the first channel and the probability sum of the second channel as the conditional entropy of the second channel on the first channel.
In an embodiment of the present application, the conditional entropy is determined according to the following formula:
H(Y|X) = -∑_{x∈X} p(x) ∑_{y∈Y} p(y|x) log p(y|x);  (2)
where X denotes the first channel, Y denotes the second channel, p(x) denotes the probability of the first channel in an interval, and p(y|x) denotes the conditional probability of the second channel relative to the first channel in that interval.
In one embodiment, to determine the conditional probability of the second channel relative to the first channel, the number of the second channel's two-norm values falling in each of the first channel's intervals needs to be determined. For example: the first channel is divided into 3 intervals, [1.0, 2.0), [2.0, 3.0) and [3.0, 4.0]. The second channel has 20 two-norm values, and the conditional probability of the second channel relative to the first channel is determined from the number of the second channel's two-norm values falling in each interval. For example: the first interval [1.0, 2.0) contains 2 two-norm values, so the conditional probability of the second channel relative to the first channel in the first interval is 0.1; the second interval [2.0, 3.0) contains 4 two-norm values, so the conditional probability in the second interval is 0.2; the third interval [3.0, 4.0] contains 8 two-norm values, so the conditional probability in the third interval is 0.4.
Thus, the conditional entropy of the second channel to the first channel is determined according to the conditional entropy formula.
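The following is a minimal sketch of formula (2) for a pair of channels, assuming, as in the example above, that p(x) comes from the first channel's own intervals and that p(y|x) is the fraction of the second channel's two-norm values falling in each of the first channel's intervals; this discretisation of the inner sum is an assumption made for illustration, and intervals with zero probability are treated as contributing nothing.

```python
import numpy as np

def conditional_entropy(two_norms_x: np.ndarray, two_norms_y: np.ndarray,
                        num_bins: int = 3) -> float:
    """H(Y|X) between two channels, each given as a 1-D array of two-norm values."""
    counts_x, edges = np.histogram(two_norms_x, bins=num_bins)
    p_x = counts_x / len(two_norms_x)             # probability of channel X per interval
    counts_y, _ = np.histogram(two_norms_y, bins=edges)   # bin Y's values by X's edges
    p_y_given_x = counts_y / len(two_norms_y)     # conditional probability per interval
    h = 0.0
    for px, pyx in zip(p_x, p_y_given_x):
        if pyx > 0:                               # 0 * log 0 is treated as 0
            h -= px * pyx * np.log(pyx)
    return h
```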
In the embodiment of the present application, instead of using the two-norm to represent a feature map, the one-norm (L1 norm) or global average pooling may also be used, which is not limited by the present application.
Step 203: and averaging the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels in each convolution layer aiming at each convolution layer of the initial neural network model to obtain the average conditional entropy between the feature map corresponding to each channel and the feature map corresponding to other channels.
In one embodiment, if there are 4 channels in the convolutional layer, and the conditional entropy of the first channel to the second channel is-0.5; the conditional entropy of the first channel to the third channel is-0.3; the conditional entropy of the first channel to the fourth channel is-0.1; the average conditional entropy of the first channel for the other channels is-0.3.
Step 204: and aiming at each convolution layer of the initial neural network model, pruning the initial neural network model according to the average conditional entropy corresponding to each channel in the convolution layer to obtain an optimized neural network model.
In the embodiment of the application, the number of channels to prune can be fixed in advance to obtain the optimized neural network model, which can be implemented as follows: pruning a preset number of channels in ascending order of the average conditional entropy corresponding to each channel.
In one embodiment, if there are 4 channels in the convolutional layer, and the average conditional entropy of the first channel to the other channels is-0.3; the average conditional entropy of the second channel to other channels is-0.5; the average conditional entropy of the third channel to other channels is-0.2; the average conditional entropy of the fourth channel for the other channels is-0.4. If the convolution layer needs to prune 2 channels, pruning the second channel corresponding to-0.5 and the fourth channel corresponding to-0.4 according to the sequence from small to large, thereby completing model optimization and obtaining an optimized neural network model.
Thus, by limiting the number of channels to prune, each convolution layer can be guaranteed to prune the same number of channels.
Of course, instead of limiting the number of channels to prune, a threshold may be set: when a channel's average conditional entropy is determined to be not greater than the threshold, that channel is considered unimportant and is pruned. In one embodiment, if there are 4 channels in the convolution layer, the average conditional entropy of the first channel with respect to the other channels is -0.3, that of the second channel is -0.5, that of the third channel is -0.2 and that of the fourth channel is -0.4, and the threshold is set to -0.4, then each channel's average conditional entropy is compared with the threshold, and the second channel corresponding to -0.5 and the fourth channel corresponding to -0.4 are pruned, thereby completing the model optimization and obtaining the optimized neural network model.
It should be noted that if the average conditional entropy of every channel is greater than the threshold, no channel is pruned.
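The following is a minimal sketch of the two pruning criteria described above (a preset number of channels, or a threshold); the function name is hypothetical, and the entropy values are taken from the example.

```python
import numpy as np

def channels_to_prune(avg_cond_entropy: np.ndarray,
                      num_to_prune=None, threshold=None) -> list:
    """`avg_cond_entropy[c]` is channel c's average conditional entropy with respect
    to the other channels; returns the indices of the channels selected for pruning."""
    if num_to_prune is not None:
        order = np.argsort(avg_cond_entropy)                 # smallest values first
        return order[:num_to_prune].tolist()
    if threshold is not None:
        return np.flatnonzero(avg_cond_entropy <= threshold).tolist()
    return []

# Reproducing the example: entropies (-0.3, -0.5, -0.2, -0.4) and 2 channels to prune
# select the second and fourth channels (indices 1 and 3).
print(channels_to_prune(np.array([-0.3, -0.5, -0.2, -0.4]), num_to_prune=2))  # [1, 3]
```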
Step 205: and inputting the data to be processed into the optimized neural network model to obtain the processed data output by the optimized neural network model.
In this way, pruning the initial neural network model reduces its number of parameters and its amount of computation, so that the speed of model operation is improved and the requirements on deployment devices are reduced while the model's effectiveness is maintained.
To display the obtained conditional entropies more intuitively, they may be arranged in a matrix, which may be specifically implemented as steps C1-C2:
Step C1: acquiring the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to that same channel.
In the embodiment of the application, the conditional entropy between the feature map corresponding to a channel and the feature map corresponding to that same channel is 0.
Step C2: generating a conditional entropy matrix according to the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to the other channels and the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to that same channel; the numbers of rows and columns of the conditional entropy matrix both equal the number of channels of the convolution layer, and the conditional entropy in each row of the conditional entropy matrix is the conditional entropy between the feature map corresponding to the same channel and the feature maps corresponding to the other channels.
In one embodiment, if there are 4 channels in the convolutional layer, 16 conditional entropies can be obtained, respectively: H (1|1), H (1|2), H (1|3), H (1|4), H (2|1), H (2|2), H (2|3), H (2|4), H (3|1), H (3|2), H (3|3), H (3|4), H (4|1), H (4|2), H (4|3), H (4|4). Thus, a 4×4 conditional entropy matrix is generated:
[H(1|1),H(1|2),H(1|3),H(1|4);
H(2|1),H(2|2),H(2|3),H(2|4);
H(3|1),H(3|2),H(3|3),H(3|4);
H(4|1),H(4|2),H(4|3),H(4|4)];
Thus, when calculating the average conditional entropy of each channel with respect to the other channels, the conditional entropy matrix only needs to be summed row by row and each row sum divided by the number of conditional entropies between that channel and the other channels.
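The following is a minimal sketch of steps C1-C2 and of the row-wise averaging just described, reusing the hypothetical conditional_entropy() helper from the earlier sketch; the diagonal entries H(i|i) are kept at 0 as stated in step C1.

```python
import numpy as np

def average_conditional_entropy(two_norms: np.ndarray, num_bins: int = 3) -> np.ndarray:
    """`two_norms` has shape (num_samples, num_channels); returns one average
    conditional entropy per channel, built from the conditional_entropy() sketch above."""
    num_channels = two_norms.shape[1]
    matrix = np.zeros((num_channels, num_channels))        # entry (i, j) is H(i | j)
    for i in range(num_channels):
        for j in range(num_channels):
            if i != j:                                      # H(i | i) stays 0 on the diagonal
                matrix[i, j] = conditional_entropy(two_norms[:, j], two_norms[:, i],
                                                   num_bins)
    return matrix.sum(axis=1) / (num_channels - 1)          # row sum over the other channels
```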
Having introduced the conditional-entropy-based channel pruning part, the training of the pruned model is explained below.
To further improve the effectiveness of the optimized neural network model, it can be trained again, which can be specifically implemented as: training the optimized neural network model.
Specifically, the optimized neural network model may be fine-tuned (finetune), i.e. the parameters of the optimized neural network model are adjusted by retraining it.
In the embodiment of the application, the method for retraining the optimized neural network model is the same as the model pre-training in the first part: a number of labelled training samples are input into the optimized neural network model to obtain output results, the parameters of the optimized neural network model are adjusted according to the output results and the labels, and when the output results of the optimized neural network model meet the requirements, the retraining of the optimized neural network model is complete. Retraining the optimized neural network model in this way further improves its effectiveness.
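For illustration, the following is a minimal sketch of such fine-tuning, assuming a PyTorch model and a labelled data loader; the loss function, optimizer and stopping rule are illustrative choices, not prescribed by the application.

```python
import torch
import torch.nn as nn

def finetune(pruned_model: nn.Module, train_loader, epochs: int = 10, lr: float = 1e-3):
    optimizer = torch.optim.SGD(pruned_model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    pruned_model.train()
    for _ in range(epochs):
        for samples, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(pruned_model(samples), labels)  # compare outputs with labels
            loss.backward()
            optimizer.step()                                 # adjust the model parameters
    return pruned_model
```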
As shown in fig. 3, the embodiment of the present application further provides a complete method for data processing, including:
step 301: and obtaining training samples of the data to be processed, inputting the training samples into the trained initial neural network model, and respectively obtaining feature graphs output by each convolution layer in the initial neural network model.
Step 302: and determining a characteristic diagram corresponding to each channel in each convolution layer aiming at each convolution layer of the initial neural network model, and calculating the two norms of the characteristic diagram corresponding to each channel.
Step 303: dividing the two norms of the feature map corresponding to each channel into a preset number of sections, and determining the probability of each channel in each section according to the number of the two norms of the feature map corresponding to each channel in each section.
Step 304: and determining the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels according to the probability of each channel in each interval and the conditional entropy formula.
Step 305: and acquiring conditional entropy between the feature map corresponding to each channel and the feature map corresponding to each channel.
Step 306: generating a conditional entropy matrix according to conditional entropy between the feature images corresponding to each channel and the feature images corresponding to other channels and conditional entropy between the feature images corresponding to each channel and the feature images corresponding to the channel; the row and column of the conditional entropy matrix are the number of channels of the convolution layer, and the conditional entropy in each row of the conditional entropy matrix is the conditional entropy between the feature map corresponding to the same channel and the feature maps corresponding to other channels.
Step 307: and averaging the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels in each convolution layer aiming at each convolution layer of the initial neural network model to obtain the average conditional entropy between the feature map corresponding to each channel and the feature map corresponding to other channels.
Step 308: and pruning a preset number of channels corresponding to the average conditional entropy in the convolution layers according to the order from small to large aiming at each convolution layer of the initial neural network model to obtain an optimized neural network model.
Step 309: and training the optimized neural network model.
Step 310: inputting the data to be processed into the trained optimized neural network model to obtain the processed data output by the trained optimized neural network model.
Based on the same inventive concept, the embodiment of the application also provides a data processing device. As shown in fig. 4, the apparatus includes:
the feature map obtaining module 401 is configured to obtain training samples of data to be processed, and input the training samples into a trained initial neural network model, and obtain feature maps output by each convolution layer in the initial neural network model respectively;
a first conditional entropy obtaining module 402, configured to calculate, for each convolutional layer of the initial neural network model, conditional entropy between a feature map corresponding to each channel in the convolutional layer and feature maps corresponding to other channels;
The average conditional entropy obtaining module 403 is configured to average, for each convolution layer of the initial neural network model, conditional entropy between a feature map corresponding to each channel and feature maps corresponding to other channels in the convolution layer, so as to obtain average conditional entropy between the feature map corresponding to each channel and the feature map corresponding to other channels;
pruning module 404, configured to prune, for each convolutional layer of the initial neural network model, the initial neural network model according to an average conditional entropy corresponding to each channel in the convolutional layer, to obtain an optimized neural network model;
the data obtaining module 405 is configured to input the data to be processed into the optimized neural network model, and obtain processed data output by the optimized neural network model.
In one embodiment, the pruning module 404 is specifically configured to prune a preset number of channels in ascending order of the average conditional entropy corresponding to each channel.
In one embodiment, the first conditional entropy acquisition module 402 includes:
the determining unit is used for determining a characteristic diagram corresponding to each channel in each convolution layer aiming at each convolution layer of the initial neural network model, and calculating the two norms of the characteristic diagram corresponding to each channel;
The probability determining unit is used for dividing the two norms of the feature graphs corresponding to the channels into intervals with preset quantity, and determining the probability of each channel in each interval according to the quantity of the two norms of the feature graphs corresponding to the channels in each interval;
and the conditional entropy determining unit is used for determining the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels according to the probability of each channel in each interval and the conditional entropy formula.
In one embodiment, determining the conditional entropy unit comprises:
the first determining probability sum subunit is used for summing the probabilities of the first channel in each interval to obtain the probability sum of the first channel; and
a second determining probability sum subunit, configured to sum products of conditional probabilities of a second channel in each interval relative to the first channel in each interval and logarithms of the conditional probabilities to obtain a probability sum of the second channel;
and a conditional entropy determination subunit, configured to take a negative number of a product of the probability sum of the first channel and the probability sum of the second channel as conditional entropy of the second channel on the first channel.
In one embodiment, the apparatus further comprises:
The second conditional entropy obtaining module is configured to, for each convolutional layer of the initial neural network model, obtain the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to that same channel, after the first conditional entropy obtaining module 402 calculates the conditional entropy between the feature map corresponding to each channel in the convolutional layer and the feature maps corresponding to the other channels;
the conditional entropy matrix generating module is configured to generate a conditional entropy matrix according to the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to the other channels and the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to that same channel; the numbers of rows and columns of the conditional entropy matrix both equal the number of channels of the convolutional layer, and the conditional entropy in each row of the conditional entropy matrix is the conditional entropy between the feature map corresponding to the same channel and the feature maps corresponding to the other channels.
In one embodiment, the apparatus further comprises:
the training module is configured to, for each convolution layer of the initial neural network model, perform pruning processing on the initial neural network model according to average conditional entropy corresponding to each channel in the convolution layer, obtain an optimized neural network model, and then train the optimized neural network model.
Based on the same technical concept, the embodiment of the present application further provides a terminal device 500, and referring to fig. 5, the terminal device 500 is configured to implement the methods described in the above embodiments of the methods, for example, implement the embodiment shown in fig. 2, where the terminal device 500 may include a memory 501, a processor 502, an input unit 503, and a display panel 504.
A memory 501 for storing a computer program for execution by the processor 502. The memory 501 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the terminal device 500, and the like. The processor 502 may be a central processing unit (CPU), a digital processing unit, or the like. An input unit 503 may be used to obtain user instructions input by a user. The display panel 504 is configured to display information input by the user or information provided to the user; in the embodiment of the present application, the display panel 504 is mainly configured to display the interfaces of the applications in the terminal device and the controls shown in those interfaces. Optionally, the display panel 504 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
The specific connection medium between the memory 501, the processor 502, the input unit 503, and the display panel 504 is not limited in the embodiment of the present application. In the embodiment of the present application, in fig. 5, the memory 501, the processor 502, the input unit 503, and the display panel 504 are connected by a bus 505, where the bus 505 is indicated by a thick line in fig. 5, and the connection manner between other components is only schematically illustrated, but not limited thereto. The bus 505 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 5, but not only one bus or one type of bus.
The memory 501 may be a volatile memory such as random-access memory (RAM); the memory 501 may also be a non-volatile memory such as read-only memory, flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or it may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 501 may also be a combination of the above.
A processor 502 for implementing the embodiment shown in fig. 2, comprising:
a processor 502 for invoking a computer program stored in memory 501 to perform the embodiment shown in fig. 2.
The embodiment of the application also provides a computer-readable storage medium that stores the computer-executable instructions required by the above processor, containing a program to be executed by the above processor.
In some possible embodiments, aspects of the data processing method provided by the present application may also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to perform the steps of the data processing method according to the various exemplary embodiments of the application described above. For example, the terminal device may perform the embodiment shown in fig. 2.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A program product for data processing in accordance with embodiments of the present application may employ a portable compact disc read-only memory (CD-ROM), include program code, and be run on a computing device. However, the program product of the present application is not limited thereto; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the elements described above may be embodied in one element in accordance with embodiments of the present application. Conversely, the features and functions of one unit described above may be further divided into a plurality of units to be embodied.
Furthermore, although the operations of the methods of the present application are depicted in the drawings in a particular order, this should not be understood as requiring that the operations be performed in that particular order or that all of the illustrated operations be performed to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A data processing method, applied to a camera, the method comprising:
inputting the acquired image to be identified into an optimized neural network model to obtain an object re-identification result output by the optimized neural network model; wherein the optimized neural network model is obtained by:
acquiring a training image containing an object, inputting the training image into a trained initial neural network model, and respectively acquiring feature images output by each convolution layer in the initial neural network model; the feature map refers to image features generated after convolution of the training image and a convolution kernel in the initial neural network model, and one feature map can be obtained by one training image and one channel of the convolution layer;
for each convolution layer of the initial neural network model, calculating conditional entropy between a feature map corresponding to each channel in the convolution layer and feature maps corresponding to other channels;
For each convolution layer of the initial neural network model, averaging conditional entropy between the feature images corresponding to each channel in the convolution layer and the feature images corresponding to other channels to obtain average conditional entropy between the feature images corresponding to each channel and the feature images corresponding to other channels;
aiming at each convolution layer of the initial neural network model, pruning the initial neural network model according to average conditional entropy corresponding to each channel in the convolution layer to obtain an optimized neural network model;
and respectively inputting a plurality of training images with labels into the optimized neural network model to obtain an output result, and adjusting parameters in the optimized neural network model according to the output result and the labels until the output result of the optimized neural network model meets the preset condition, so as to complete the training of the optimized neural network model.
2. The method according to claim 1, wherein pruning the initial neural network model according to the average conditional entropy corresponding to each channel in the convolutional layer comprises:
and pruning a preset number of channels in ascending order of the average conditional entropy corresponding to each channel.
3. The method according to claim 1, wherein for each convolution layer of the initial neural network model, calculating a conditional entropy between a feature map corresponding to each channel in the convolution layer and feature maps corresponding to other channels includes:
determining, for each convolution layer of the initial neural network model, the feature map corresponding to each channel in the convolution layer, and calculating the two-norm of the feature map corresponding to each channel;
dividing the two norms of the feature map corresponding to each channel into a preset number of sections, and determining the probability of each channel in each section according to the number of the two norms of the feature map corresponding to each channel in each section;
and determining the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels according to the probability of each channel in each interval and the conditional entropy formula.
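A minimal sketch of claim 3's discretization step, assuming the feature maps of one convolution layer are available as a NumPy array of shape (num_images, num_channels, H, W): the two-norm of each channel's feature map is computed per training image, the value range is split into a preset number of intervals, and each channel's probability of falling into each interval is estimated from counts. The array shapes, the shared equal-width binning, and the function name are assumptions; the patent does not specify how the interval edges are chosen.

```python
import numpy as np

def interval_probabilities(feature_maps, num_intervals=32):
    """Estimate per-channel interval probabilities from feature-map two-norms.

    feature_maps : (N, C, H, W) array, N training images, C channels.
    Returns
        probs : (C, num_intervals) array, probs[c, k] = P(channel c's norm in interval k)
        bins  : (C, N) integer array of interval indices, reusable for joint
                statistics between channel pairs.
    """
    n, c = feature_maps.shape[:2]
    # Two-norm of each channel's feature map: one scalar per (image, channel).
    norms = np.linalg.norm(feature_maps.reshape(n, c, -1), axis=2)          # (N, C)
    # Split the overall norm range into a preset number of equal-width intervals
    # (a single shared binning for all channels; an assumption, not stated in the claim).
    edges = np.linspace(norms.min(), norms.max() + 1e-8, num_intervals + 1)
    bins = np.clip(np.digitize(norms, edges) - 1, 0, num_intervals - 1).T   # (C, N)
    probs = np.stack([np.bincount(bins[ch], minlength=num_intervals) / n
                      for ch in range(c)])
    return probs, bins
```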
4. The method according to claim 3, wherein determining the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels according to the probability of each channel in each interval and the conditional entropy formula comprises:
summing the probabilities of a first channel in each interval to obtain the probability sum of the first channel; and
summing, over the intervals, the product of the conditional probability of a second channel in each interval relative to the first channel in each interval and the logarithm of that conditional probability, to obtain the probability sum of the second channel;
and taking the negative of the product of the probability sum of the first channel and the probability sum of the second channel as the conditional entropy of the second channel with respect to the first channel.
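As a point of reference for claim 4, the sketch below estimates the conditional entropy of one channel given another from a joint histogram of the interval indices produced by the previous sketch. It uses the standard definition H(Y|X) = -Σ_{x,y} p(x, y) log p(y|x); claim 4 instead expresses the computation as the negative of a product of two sums, so this is a textbook reference formulation, not a literal transcription of the claimed formula. The function name and input layout are assumptions.

```python
import numpy as np

def conditional_entropy(bins_first, bins_second, num_intervals):
    """H(second | first) estimated from interval indices of two channels.

    bins_first, bins_second : length-N integer arrays of interval indices
    (one per training image), e.g. rows of `bins` from the earlier sketch.
    """
    n = len(bins_first)
    joint = np.zeros((num_intervals, num_intervals))
    for x, y in zip(bins_first, bins_second):
        joint[x, y] += 1.0
    joint /= n                                   # joint probability p(x, y)
    p_first = joint.sum(axis=1, keepdims=True)   # marginal p(x) of the first channel
    with np.errstate(divide="ignore", invalid="ignore"):
        cond = np.where(p_first > 0, joint / p_first, 0.0)   # conditional p(y | x)
        log_cond = np.where(cond > 0, np.log(cond), 0.0)
    return float(-(joint * log_cond).sum())
```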
5. The method according to claim 1, wherein after calculating, for each convolution layer of the initial neural network model, conditional entropy between a feature map corresponding to each channel and feature maps corresponding to other channels in the convolution layer, the method further comprises:
acquiring the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to the same channel;
generating a conditional entropy matrix according to the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels, and the conditional entropy between the feature map corresponding to each channel and the feature map corresponding to the same channel; the numbers of rows and columns of the conditional entropy matrix both equal the number of channels of the convolution layer, and each row of the conditional entropy matrix holds the conditional entropy between the feature map corresponding to one channel and the feature maps corresponding to the other channels.
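A minimal sketch of assembling the conditional-entropy matrix of claim 5, reusing the hypothetical `conditional_entropy` and `interval_probabilities` helpers sketched above: a C x C matrix whose rows and columns both index the layer's channels, where each row holds the conditional entropies of one channel's feature map against every channel, including itself on the diagonal.

```python
import numpy as np

def conditional_entropy_matrix(bins, num_intervals):
    """Build the C x C conditional-entropy matrix for one convolution layer.

    bins : (C, N) integer array of per-image interval indices for each channel
           (output of the earlier `interval_probabilities` sketch).
    Entry [i, j] holds the conditional entropy between channel i's feature map
    and channel j's feature map; the diagonal holds each channel's conditional
    entropy with respect to its own feature map.
    """
    c = bins.shape[0]
    matrix = np.zeros((c, c))
    for i in range(c):
        for j in range(c):
            matrix[i, j] = conditional_entropy(bins[i], bins[j], num_intervals)
    return matrix
```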
6. A data processing apparatus for use with a camera, the apparatus comprising:
the data acquisition module is used for inputting the acquired image to be identified into the optimized neural network model to obtain an object re-identification result output by the optimized neural network model; wherein the optimized neural network model is obtained by:
the feature map acquisition module is used for acquiring a training image containing an object, inputting the training image into a trained initial neural network model, and respectively acquiring the feature maps output by each convolution layer in the initial neural network model; a feature map refers to the image features generated by convolving the training image with a convolution kernel in the initial neural network model, and one training image together with one channel of a convolution layer yields one feature map;
the first conditional entropy acquisition module is used for calculating, for each convolution layer of the initial neural network model, the conditional entropy between the feature map corresponding to each channel in the convolution layer and the feature maps corresponding to other channels;
the average conditional entropy obtaining module is used for averaging, for each convolution layer of the initial neural network model, the conditional entropy between the feature map corresponding to each channel in the convolution layer and the feature maps corresponding to other channels, to obtain the average conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels;
the pruning module is used for pruning, for each convolution layer of the initial neural network model, the initial neural network model according to the average conditional entropy corresponding to each channel in the convolution layer, to obtain an optimized neural network model;
and respectively inputting a plurality of labelled training images into the optimized neural network model to obtain an output result, and adjusting the parameters of the optimized neural network model according to the output result and the labels until the output result of the optimized neural network model meets a preset condition, thereby completing the training of the optimized neural network model.
7. The apparatus of claim 6, wherein the first conditional entropy acquisition module comprises:
the determining unit is used for determining, for each convolution layer of the initial neural network model, the feature map corresponding to each channel in the convolution layer, and calculating the two-norm of the feature map corresponding to each channel;
the probability determining unit is used for dividing the two-norms of the feature maps corresponding to each channel into a preset number of intervals, and determining the probability of each channel in each interval according to the number of two-norms of the feature maps corresponding to that channel falling in each interval;
and the conditional entropy determining unit is used for determining the conditional entropy between the feature map corresponding to each channel and the feature maps corresponding to other channels according to the probability of each channel in each interval and the conditional entropy formula.
8. A computer readable medium storing computer executable instructions for performing the method of any one of claims 1-5.
9. A computing device, comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
CN201911303249.9A 2019-12-17 2019-12-17 Data processing method, device and storage medium Active CN111062477B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911303249.9A CN111062477B (en) 2019-12-17 2019-12-17 Data processing method, device and storage medium

Publications (2)

Publication Number Publication Date
CN111062477A (en) 2020-04-24
CN111062477B (en) 2023-12-08 (granted)

Family

ID=70302124

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911303249.9A Active CN111062477B (en) 2019-12-17 2019-12-17 Data processing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111062477B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116304677A (en) * 2023-01-30 2023-06-23 格兰菲智能科技有限公司 Channel pruning method and device for model, computer equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7299089B2 (en) * 2004-08-20 2007-11-20 Duke University Methods, systems, and computer program products for neural channel selection in a multi-channel system

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103299307A (en) * 2011-08-23 2013-09-11 华为技术有限公司 Estimator for estimating a probability distribution of a quantization index
WO2016061742A1 (en) * 2014-10-21 2016-04-28 Intellectual Ventures Hong Kong Limited Automatic profiling framework of cross-vm covert channel capacity
CN106656477A (en) * 2015-10-30 2017-05-10 财团法人工业技术研究院 Secret key generating device and method based on vector quantization
CN106611283A (en) * 2016-06-16 2017-05-03 四川用联信息技术有限公司 Manufacturing material purchasing analysis method based on decision tree algorithm
CN106611295A (en) * 2016-06-28 2017-05-03 四川用联信息技术有限公司 Decision tree-based evolutionary programming algorithm for solving material purchasing problem in manufacturing industry
CN107403345A (en) * 2017-09-22 2017-11-28 北京京东尚科信息技术有限公司 Best-selling product Forecasting Methodology and system, storage medium and electric terminal
CN108304974A (en) * 2018-02-26 2018-07-20 中国民航信息网络股份有限公司 A kind of civil aviaton NOSHOW predictions based on optimization C5.0 and Apriori and strong factor-analysis approach
CN110474786A (en) * 2018-05-10 2019-11-19 上海大唐移动通信设备有限公司 Method and device based on random forest analysis VoLTE network failure reason
CN109389043A (en) * 2018-09-10 2019-02-26 中国人民解放军陆军工程大学 A kind of crowd density estimation method of unmanned plane picture
CN109658241A (en) * 2018-11-23 2019-04-19 成都知道创宇信息技术有限公司 A kind of screw-thread steel forward price ups and downs probability forecasting method
CN109344921A (en) * 2019-01-03 2019-02-15 湖南极点智能科技有限公司 A kind of image-recognizing method based on deep neural network model, device and equipment
CN110097187A (en) * 2019-04-29 2019-08-06 河海大学 It is a kind of based on activation-entropy weight hard cutting CNN model compression method
CN110119811A (en) * 2019-05-15 2019-08-13 电科瑞达(成都)科技有限公司 A kind of convolution kernel method of cutting out based on entropy significance criteria model
CN110533383A (en) * 2019-07-24 2019-12-03 平安科技(深圳)有限公司 Item supervises and manage method, apparatus, computer equipment and storage medium
CN110516748A (en) * 2019-08-29 2019-11-29 泰康保险集团股份有限公司 Method for processing business, device, medium and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Two-phase filter pruning based on conditional entropy; Min C et al.; arXiv preprint arXiv:1809.02220; 1-3 *
Design and FPGA verification of a TNN-based traffic sign recognition algorithm; Han Zhi; China Master's Theses Full-text Database, Engineering Science and Technology II (No. 5); C034-570 *

Also Published As

Publication number Publication date
CN111062477A (en) 2020-04-24

Similar Documents

Publication Publication Date Title
US10817805B2 (en) Learning data augmentation policies
US9830526B1 (en) Generating image features based on robust feature-learning
WO2022007823A1 (en) Text data processing method and device
CN111782838B (en) Image question-answering method, device, computer equipment and medium
CN112699991A (en) Method, electronic device, and computer-readable medium for accelerating information processing for neural network training
CN111523640B (en) Training method and device for neural network model
CN111782840B (en) Image question-answering method, device, computer equipment and medium
CN111542841A (en) System and method for content identification
EP3884426B1 (en) Action classification in video clips using attention-based neural networks
KR102250728B1 (en) Sample processing method and device, related apparatus and storage medium
CN113449859A (en) Data processing method and device
WO2022111387A1 (en) Data processing method and related apparatus
WO2024001806A1 (en) Data valuation method based on federated learning and related device therefor
CN111062477B (en) Data processing method, device and storage medium
CN114241411B (en) Counting model processing method and device based on target detection and computer equipment
CN117217280A (en) Neural network model optimization method and device and computing equipment
CN115601513A (en) Model hyper-parameter selection method and related device
CN113569018A (en) Question and answer pair mining method and device
CN111091198A (en) Data processing method and device
CN110705695A (en) Method, device, equipment and storage medium for searching model structure
CN113039555B (en) Method, system and storage medium for classifying actions in video clips
US20240135174A1 (en) Data processing method, and neural network model training method and apparatus
CN112132269B (en) Model processing method, device, equipment and storage medium
WO2023236900A1 (en) Item recommendation method and related device thereof
CN116109449A (en) Data processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40021095

Country of ref document: HK

GR01 Patent grant