CN116630724B - Data model generation method, image processing method, device and chip - Google Patents


Info

Publication number
CN116630724B
CN116630724B (application CN202310907961.XA)
Authority
CN
China
Prior art keywords
data model
image
data
model
training
Prior art date
Legal status
Active
Application number
CN202310907961.XA
Other languages
Chinese (zh)
Other versions
CN116630724A (en)
Inventor
唐剑
刘宁
张法朝
Current Assignee
Midea Robozone Technology Co Ltd
Original Assignee
Midea Robozone Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Midea Robozone Technology Co Ltd filed Critical Midea Robozone Technology Co Ltd
Priority to CN202310907961.XA priority Critical patent/CN116630724B/en
Publication of CN116630724A publication Critical patent/CN116630724A/en
Application granted granted Critical
Publication of CN116630724B publication Critical patent/CN116630724B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/042: Knowledge-based neural networks; Logical representations of neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/096: Transfer learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of neural network model compression, and provides a data model generation method, an image processing method, a device and a chip. The data model generation method comprises the following steps: acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model; performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier; and determining a fourth data model according to the second auxiliary classifier and the third data model.

Description

Data model generation method, image processing method, device and chip
Technical Field
The present invention relates to the technical field of neural network model compression, and in particular to a data model generation method, an image processing method, a device and a chip.
Background
Data-free knowledge distillation is widely studied at present. It extracts information from a trained network to generate images, then uses the generated images to transfer the teacher network's knowledge to a student network; by repeating this cycle, a large model is compressed into a small model without ever touching the original data set.
However, existing data-free knowledge distillation techniques suffer from problems such as low model precision and poor model recognition accuracy.
Disclosure of Invention
The present invention aims to solve at least the technical problems of low model precision and poor model recognition accuracy in the prior art or related technologies.
To this end, a first aspect of the present invention is to provide a data model generation method.
A second aspect of the present invention is to provide an image processing method.
A third aspect of the present invention is to provide a data model generation apparatus.
A fourth aspect of the present invention is to provide a readable storage medium.
A fifth aspect of the present invention is to provide a computer program product.
A sixth aspect of the present invention is to provide a chip.
In view of this, according to a first aspect of the present invention, there is provided a data model generation method including: acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model; performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier; a fourth data model is determined based on the second auxiliary classifier and the third data model.
According to the data model generation method, the second data model undergoes several rounds of data training based on the training image set and the first data model to generate the third data model; the first auxiliary classifier is trained and optimized according to the first data model and the training image set to determine the corresponding second auxiliary classifier; and the output of the third data model is constrained by the second auxiliary classifier to obtain the fourth data model. This improves the model precision of the fourth data model and guarantees its recognition accuracy, while also streamlining the generation steps of the fourth data model, improving its generation efficiency, expanding its application range, and enriching its application scenarios.
According to a second aspect of the present invention, there is provided an image processing method comprising: acquiring an image set to be detected and an image classification model; and classifying the image set to be detected through an image classification model to obtain a target image set, wherein the image classification model is a data model determined through the generation method of the data model in any technical scheme.
According to the image processing method in this solution, the image classification model classifies the image set to be detected and determines the target image set within it. Classifying with the image classification model improves the classification accuracy of the image set to be detected and thereby guarantees the data accuracy of the target image set.
According to a third aspect of the present invention, there is provided a data model generating apparatus, comprising: the first processing module is used for acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model; the first processing module is further used for carrying out data training on the first auxiliary classifier through the first data model and the training image set so as to obtain a second auxiliary classifier; the first processing module is further configured to determine a fourth data model according to the second auxiliary classifier and the third data model.
According to the data model generation apparatus in this solution, the second data model undergoes several rounds of data training based on the training image set and the first data model to generate the third data model; the first auxiliary classifier is trained and optimized according to the first data model and the training image set to determine the corresponding second auxiliary classifier; and the output of the third data model is constrained by the second auxiliary classifier to obtain the fourth data model. This improves the model precision of the fourth data model and guarantees its recognition accuracy, while also streamlining the generation steps of the fourth data model, improving its generation efficiency, expanding its application range, and enriching its application scenarios.
According to a fourth aspect of the present invention, there is provided a readable storage medium having stored thereon a program or instructions which, when executed by a processor, implement the method of generating a data model in any of the above-described aspects or the method of processing an image in any of the above-described aspects. Therefore, the readable storage medium has all the advantages of the method for generating a data model in any of the above-mentioned technical solutions or the method for processing an image in any of the above-mentioned technical solutions, and will not be described in detail herein.
According to a fifth aspect of the present invention, a computer program product is presented, comprising computer instructions which, when executed by a processor, implement a method of generating a data model as in any of the above-mentioned aspects or an image processing method as in any of the above-mentioned aspects. Therefore, the computer program product has all the advantages of the method for generating a data model in any of the above-mentioned technical solutions or the method for processing an image in any of the above-mentioned technical solutions, and will not be described in detail herein.
According to a sixth aspect of the present invention, a chip is provided. The chip includes a program or instructions; when the chip runs, the program or instructions implement the data model generation method or the image processing method in any of the above solutions, with all of their beneficial effects, which are not repeated here.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is the first schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 2 is the second schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 3 is the third schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 4 is the fourth schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 5 is the fifth schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 6 is the sixth schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 7 is the seventh schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 8 is the eighth schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 9 is the ninth schematic flowchart of a data model generation method in an embodiment of the invention;
FIG. 10 is the first structural block diagram of a data model generation apparatus in an embodiment of the invention;
FIG. 11 is a schematic diagram of a data model generation apparatus in an embodiment of the invention;
FIG. 12 is the second structural block diagram of a data model generation apparatus in an embodiment of the invention;
FIG. 13 is a schematic flowchart of an image processing method in an embodiment of the invention;
FIG. 14 is the first structural block diagram of an image processing apparatus in an embodiment of the invention;
FIG. 15 is the second structural block diagram of an image processing apparatus in an embodiment of the invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, without conflict, the embodiments of the present invention and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those described herein, and the scope of the invention is therefore not limited to the specific embodiments disclosed below.
The method for generating a data model, the method for processing an image, the device and the chip according to the embodiments of the present application are described in detail below with reference to fig. 1 to 15 by means of specific embodiments and application scenarios thereof.
The execution subject of the data model generation method provided by this application may be a generation device, or may be determined according to actual use requirements; this is not specifically limited here. To describe the data model generation method more clearly, the following description takes the generation device as the execution subject.
In one embodiment according to the present application, as shown in fig. 1, a method for generating a data model is provided, where the method for generating a data model includes:
step 102, acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
step 104, performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
step 106, determining a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, a data model generation method is provided. The generation device acquires a first data model, a second data model, a training image set and a first auxiliary classifier, where the first data model is a fully trained data model, the second data model is an untrained data model, the training image set is a set of data images used for model training, and the first auxiliary classifier is an existing auxiliary classifier.
Illustratively, the first data model may be embodied as a fully trained teacher network and the second data model may be embodied as an untrained student network.
The generating device performs data training on the second data model for a plurality of rounds according to the training image set and the first data model, and generates a third data model, wherein the third data model is a model trained by the second data model.
The third data model may be embodied, for example, as a trained student network.
The generating device trains and optimizes the first auxiliary classifier according to the first data model and the training image set, and determines a second auxiliary classifier corresponding to the first auxiliary classifier, wherein the second auxiliary classifier is an auxiliary classifier after optimization, and the auxiliary classifier is used for data classification.
The second auxiliary classifier may be embodied as an optimized auxiliary classifier, which may have a higher accuracy, for example.
The generating device limits the output of the third data model according to the second auxiliary classifier to obtain a fourth data model, wherein the fourth data model is a data model with converged output.
The fourth data model may be embodied as a high-precision small network, for example.
Illustratively, a fourth data model may be used for identification and classification of images.
The fourth data model may be applied to privacy- and security-sensitive fields such as the medical field. For example, the data model generation method of this embodiment is applicable when a small network must be deployed on medical edge devices while only the large network's weight parameters are available and customer privacy data cannot be provided.
According to the data model generation method, the second data model undergoes several rounds of data training based on the training image set and the first data model to generate the third data model; the first auxiliary classifier is trained and optimized according to the first data model and the training image set to determine the corresponding second auxiliary classifier; and the output of the third data model is constrained by the second auxiliary classifier to obtain the fourth data model. This improves the model precision of the fourth data model and guarantees its recognition accuracy, while also streamlining the generation steps of the fourth data model, improving its generation efficiency, expanding its application range, and enriching its application scenarios.
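The pipeline of steps 102 to 106 can be sketched with a toy numerical example. Everything below is illustrative: the models are reduced to linear weight vectors, distill_round stands in for one round of knowledge distillation, and the auxiliary-classifier constraint is elided, so none of the names or formulas are the patent's.

```python
import numpy as np

rng = np.random.default_rng(0)

def distill_round(teacher_w, student_w, images, lr=0.1):
    """One round of 'data training' for the second data model: nudge the
    student's linear weights towards matching the teacher's outputs on
    the training image set (a toy stand-in for knowledge distillation)."""
    for x in images:
        err = x @ student_w - x @ teacher_w  # disagreement on this image
        student_w = student_w - lr * err * x
    return student_w

def generate_fourth_data_model(teacher_w, student_w, images, rounds=20):
    """Steps 102-106 as a toy pipeline: several rounds of training yield
    the third data model; constraining its output with the optimised
    auxiliary classifier (elided here) yields the fourth data model."""
    for _ in range(rounds):
        student_w = distill_round(teacher_w, student_w, images)
    return student_w  # stands in for the fourth data model

teacher = np.array([1.0, -2.0, 0.5])  # first data model (fully trained)
student = np.zeros(3)                 # second data model (untrained)
train_images = [rng.standard_normal(3) for _ in range(10)]
fourth = generate_fourth_data_model(teacher, student, train_images)
```

After enough rounds the student's weights approach the teacher's behaviour on the training images, which is the sense in which the third (and then fourth) model inherits the first model's knowledge.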
In one embodiment according to the present application, as shown in fig. 2, a method for generating a data model is provided, where the method for generating a data model includes:
step 202, acquiring a first image generator, and generating a first image through the first image generator;
step 204, determining a first loss function according to the characteristic parameters of the first image and a preset category;
step 206, performing data training on the first image generator through the first loss function to obtain a second image generator;
step 208, generating a training image set through a second image generator;
step 210, acquiring a first data model, a second data model and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
step 212, performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
step 214, determining a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, the generating means acquires a first image generator, which is a module capable of generating an image, and generates a first image by the first image generator, which is an image output by the first image generator.
The first image generator may be embodied as an existing image generator, and may generate an image by inputting random noise.
The generating device determines characteristic parameters and preset categories corresponding to the first image, and then determines a first loss function according to the characteristic parameters and the preset categories, wherein the characteristic parameters are image characteristics corresponding to the first image, the preset categories are image types preset for the first image, and the first loss function is a loss function containing cross entropy.
For example, the first loss function can preset the category of the generated picture, and can also diversify the intermediate features of the picture.
The generating device performs data training on the output image of the first image generator through the first loss function, and further determines a second image generator, wherein the second image generator is an image generator after training optimization.
The second image generator may be embodied as an image generator capable of generating a high information content image, for example.
The generating means generates image data by means of a second image generator to obtain a training image set.
For example, the training image set may include a data set of high information content images.
According to the data model generation method in this embodiment, the first image is generated by the first image generator; the first loss function is determined from the characteristic parameters and the preset category of the first image; the output of the first image generator is trained to convergence through the first loss function to determine the second image generator; and the second image generator generates the image data that forms the training image set. This improves the image quality of the training image set, guarantees its data precision, and in turn guarantees the model precision of the fourth data model.
In one embodiment according to the present application, as shown in fig. 3, a method for generating a data model is provided, where the method for generating a data model includes:
step 302, acquiring randomly generated noise data, and inputting the noise data into a first image generator to obtain a first image;
step 304, determining a first loss function according to the characteristic parameters of the first image and a preset category;
step 306, performing data training on the first image generator through the first loss function to obtain a second image generator;
step 308, generating a training image set through a second image generator;
step 310, acquiring a first data model, a second data model and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
step 312, performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
step 314, determining a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, the generating means acquires noise data, which is randomly generated data, and inputs the noise data to the first image generator to thereby obtain the first image.
The noise data may be embodied as, for example, randomly generated noise images.
The data model generation method in this embodiment inputs noise data into the first image generator to obtain the first image, thereby enriching the variety of first images that can be generated.
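A minimal sketch of step 302: randomly generated noise is mapped to a first image by a generator. The tanh projection and its weights below are illustrative assumptions, not the patent's generator architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def first_image_generator(noise):
    """Stand-in for the first image generator: any mapping from a random
    noise vector to an image tensor. Here a fixed random projection
    followed by tanh produces a 32x32 RGB image in [-1, 1]."""
    w = rng.standard_normal((noise.size, 32 * 32 * 3)) / np.sqrt(noise.size)
    return np.tanh(noise @ w).reshape(32, 32, 3)

noise_data = rng.standard_normal(128)  # randomly generated noise data
first_image = first_image_generator(noise_data)
```

Each fresh noise vector yields a different first image, which is what enriches the pool of generated training data.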
In one embodiment according to the present application, as shown in fig. 4, a method for generating a data model is provided, where the method for generating a data model includes:
step 402, acquiring a first image generator, and generating a first image through the first image generator;
step 404, determining a first loss function according to the characteristic parameters of the first image and a preset category;
step 406, performing data convergence processing on the first image output by the first image generator according to the first loss function to obtain a second image;
step 408, when the information amount of the second image is greater than a preset threshold, completing the data training of the first image generator to obtain a second image generator;
step 410, generating a training image set through a second image generator;
step 412, acquiring a first data model, a second data model and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
step 414, performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
step 416, determining a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, the generating device acquires a first image output by the first image generator, performs data convergence on the first image according to the first loss function, and determines a second image, where the second image is an image obtained by optimizing the first image.
The second image may be embodied as a higher information content image, for example.
The generation device determines the information amount of the second image. When the information amount is greater than a preset threshold, the second image is suitable for training, and the generation device completes the data training of the first image generator, thereby determining the second image generator. Here, the preset threshold is a threshold parameter preset for the information amount of the second image, and the information amount is a parameter characterizing the information content of the second image.
For example, in the case where the information amount of the second image is greater than the preset threshold, the second image may be determined to be a picture capable of conveying knowledge.
According to the data model generation method in this embodiment, data convergence is performed on the first image according to the first loss function to determine the second image, and when the information amount of the second image is greater than the preset threshold, the data training of the first image generator is completed and the second image generator is determined. This guarantees the data accuracy of the second images output by the second image generator and, in turn, the model accuracy of the third data model.
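The threshold test of step 408 can be sketched as follows. The patent does not fix a concrete measure of "information amount", so the entropy of a classifier's predictive distribution for the image is used here as a hypothetical proxy; both function names are ours.

```python
import numpy as np

def information_amount(class_probs):
    """Hypothetical proxy for a generated image's 'information amount':
    the entropy of a classifier's predictive distribution for it."""
    p = np.clip(np.asarray(class_probs, dtype=float), 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def generator_training_done(batch_probs, threshold):
    """Data training of the first image generator is considered complete
    once every second image in the batch exceeds the preset threshold."""
    return all(information_amount(p) > threshold for p in batch_probs)
```

Under this proxy, an image the classifier is uncertain about (near-uniform probabilities) carries more information than one it classifies with near-certainty.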
In one embodiment according to the application, the first loss function includes a first loss term, a second loss term and a third loss term. The first loss term L_H is calculated as:
L_H = CE(h(f_T; θ_h), t);
where CE is the cross-entropy loss function, h is the first auxiliary classifier, f_T is the characteristic parameter of the first image, θ_h is the weight parameter of the first auxiliary classifier, and t is the preset category of the first image.
The second loss term L_bn is calculated as:
L_bn = (1/N) Σ_i ( ||μ_i(x) - μ̂_i|| + ||σ_i(x) - σ̂_i|| );
where x is the input image, N is the number of parameters, Σ is the summation symbol, μ_i(x) and σ_i(x) are the mean and variance statistics for x, i is the sequence number, and μ̂_i and σ̂_i are the corresponding standard parameters.
The third loss term L_adv is calculated as:
L_adv = -KL( T(x̂; θ_t)/τ || S(x̂; θ_s)/τ );
where x is the input image, KL is the divergence, T is the first data model, S is the second data model, τ is the temperature of the knowledge distillation, x̂ is the standard value of the image, θ_t is the network parameter of the first data model, and θ_s is the network parameter of the second data model.
In this embodiment, the first loss function includes a first loss term, a second loss term and a third loss term, i.e., the three loss terms that together make up the first loss function.
The data model generation method in this embodiment defines the first loss function as comprising a first, a second and a third loss term, which broadens the applicability of the first loss function.
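The three loss terms above can be sketched numerically. The averaging, absolute-value norms, and temperature softening below follow one plausible reading of the formulas and common data-free distillation practice; they are a sketch, not the patent's exact definitions.

```python
import numpy as np

def softmax(z, tau=1.0):
    """Temperature-softened softmax used for both loss sketches below."""
    z = np.asarray(z, dtype=float) / tau
    e = np.exp(z - z.max())
    return e / e.sum()

def l_h(aux_logits, t):
    """First loss term L_H = CE(h(f_T; theta_h), t): cross entropy of the
    auxiliary classifier's output against the preset category t."""
    return float(-np.log(softmax(aux_logits)[t]))

def l_bn(mu, sigma, mu_hat, sigma_hat):
    """Second loss term L_bn: average mismatch between the statistics of
    the generated images and the stored standard parameters."""
    n = len(mu)
    return float(sum(abs(m - mh) + abs(s - sh)
                     for m, s, mh, sh in zip(mu, sigma, mu_hat, sigma_hat)) / n)

def l_adv(teacher_logits, student_logits, tau=2.0):
    """Third loss term L_adv: negative KL divergence between the softened
    teacher and student outputs, so minimising it drives the generator
    towards images on which the two networks disagree."""
    p = softmax(teacher_logits, tau)
    q = softmax(student_logits, tau)
    return float(-np.sum(p * np.log(p / q)))
```

Note that l_adv is zero when teacher and student agree exactly and becomes more negative as they diverge, which is what makes it an adversarial objective for the generator.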
In one embodiment according to the present application, as shown in fig. 5, a method for generating a data model is provided, where the method for generating a data model includes:
step 502, acquiring a first image generator, and generating a first image through the first image generator;
step 504, determining a first loss function according to the characteristic parameters of the first image and a preset category;
step 506, performing data training on the first image generator through the first loss function to obtain a second image generator;
step 508, storing the image output by the second image generator in a database to obtain an image data set;
step 510, data sampling is performed on the image data set to obtain a training image set;
step 512, acquiring a first data model, a second data model and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
step 514, performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
step 516, determining a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, the generating means acquires the image output by the second image generator, stores the image in the database, and forms an image data set, wherein the image data set is a data set containing the image output by the second image generator.
For example, the generating means may store the image output by the second image generator in the storage pool.
The generating means extracts a partial image in the image data set to obtain a training image set.
For example, the generating device may extract an image with a higher information content from the image data set, and further determine the training image set.
According to the data model generation method in the embodiment, the images are stored in the database to form the image data set, and partial images are extracted from the image data set to obtain the training image set, so that the data accuracy of the training image set is ensured, and the model accuracy of the third data model is further ensured.
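The store-then-sample flow of steps 508 and 510 can be sketched as follows (the class and its method names are illustrative stand-ins for the database and sampling described above):

```python
import random

class ImagePool:
    # Minimal sketch of the image database: it stores every image emitted
    # by the second image generator and samples a training batch from the
    # accumulated image data set.
    def __init__(self):
        self.images = []

    def store(self, batch):
        # Step 508: store the generator's output in the database.
        self.images.extend(batch)

    def sample(self, batch_size, seed=None):
        # Step 510: draw a training image set from the image data set.
        rng = random.Random(seed)
        k = min(batch_size, len(self.images))
        return rng.sample(self.images, k)

pool = ImagePool()
pool.store(["img0", "img1", "img2"])
pool.store(["img3"])
training_set = pool.sample(2, seed=0)
```

A real implementation might bias the sampling toward images with higher information content, as the embodiment suggests.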
In one embodiment according to the present application, as shown in fig. 6, a method for generating a data model is provided, where the method for generating a data model includes:
step 602, acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and setting the training image set as first input data of the second data model;
step 604, determining a second loss function according to the first data model and the second data model;
step 606, performing model training on the second data model according to the first data model and the first input data to obtain a trained second data model;
step 608, performing data convergence on the trained second data model according to the second loss function to obtain a third data model;
step 610, performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
step 612, determining a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, the generating means sets the training image set as input data of the second data model, resulting in first input data, wherein the first input data is input data of the second data model.
Illustratively, the generating means sets the training image set as a generating input of the second data model.
The generating device determines a second loss function according to the first data model and the second data model, wherein the second loss function is a loss function for training the second data model.
Illustratively, the formula of the second loss function L_KD may be specifically:
L_KD = τ²·KL(softmax(T(x; θ_t)/τ) ∥ softmax(S(x; θ_s)/τ));
wherein x is the input image, KL is the KL divergence, T is the teacher network, S is the student network, τ is the temperature of the knowledge distillation, θ_t is the network parameter of the teacher network, and θ_s is the network parameter of the student network.
The generating device performs data training on the second data model according to the first data model and the first input data to obtain a trained second data model, and converges the output of the trained second data model according to the second loss function to obtain a third data model.
Illustratively, in the case where the trained second data model does not converge, the training image set is reacquired, and the second data model is retrained.
According to the method for generating the data model, the training image set is set to be input data of the second data model, so that first input data are obtained, the second data model is subjected to data training according to the first data model and the first input data, the trained second data model is obtained, output of the trained second data model is converged according to the second loss function, a third data model is obtained, model precision of the third data model is guaranteed, and model precision of the fourth data model is guaranteed.
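The convergence check in this embodiment is driven by the second loss function. A minimal sketch of a temperature-scaled KL distillation loss, assuming the common τ² gradient-scaling convention and a simple threshold test for step 608 (both assumptions, not specified by the patent):

```python
import math

def softmax(logits, tau=1.0):
    exps = [math.exp(z / tau) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, tau=4.0):
    # Second loss function L_KD: KL divergence between the softened teacher
    # and student outputs; tau**2 keeps gradients comparable across taus.
    p = softmax(teacher_logits, tau)
    q = softmax(student_logits, tau)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return tau * tau * kl

def has_converged(loss, threshold=1e-3):
    # Stand-in for the data-convergence test of step 608.
    return loss < threshold
```

If the trained second data model has not converged, the training image set is re-acquired and training repeats, as described above.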
In one embodiment according to the present application, as shown in fig. 7, a method for generating a data model is provided, where the method for generating a data model includes:
step 702, acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
step 704, setting model features of the first data model as second input data of the first auxiliary classifier;
step 706, determining a third loss function according to the first data model and the second data model;
step 708, performing data training on the first auxiliary classifier according to the second input data and the training image set to obtain a trained first auxiliary classifier;
step 710, performing data convergence on the trained first auxiliary classifier according to the third loss function to obtain a second auxiliary classifier;
step 712, determining a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, the generating means sets the model feature of the first data model as the input data of the first auxiliary classifier, and obtains the second input data, wherein the model feature is a feature parameter of the first data model, and the second input data is the input data of the first auxiliary classifier.
The generating means may be configured to set model features of the first data model as training input items of the first auxiliary classifier.
The generating means determines a third loss function based on the first data model and the second data model, wherein the third loss function is a loss function for the first auxiliary classifier.
Illustratively, the formula of the third loss function L_KD-H may be specifically:
L_KD-H = τ²·KL(softmax(T(x; θ_t)/τ) ∥ softmax(h(f_T; θ_h)/τ));
wherein x is the input image, KL is the KL divergence, T is the teacher network, h is the first auxiliary classifier, f_T is the characteristic parameter of the first image, τ is the temperature of the knowledge distillation, θ_t is the network parameter of the teacher network, and θ_h is the weight parameter of the first auxiliary classifier.
The generating device trains and optimizes the first auxiliary classifier according to the second input data and the training image set to obtain a trained first auxiliary classifier, and then converges data of the output of the trained first auxiliary classifier according to a third loss function to further determine a second auxiliary classifier.
Illustratively, in the case that the output of the trained first auxiliary classifier does not converge, the training image set is reacquired, and the first auxiliary classifier is retrained.
According to the data model generation method, model features of the first data model are set to be input data of the first auxiliary classifier, second input data are obtained, the first auxiliary classifier is trained and optimized according to the second input data and the training image set, the trained first auxiliary classifier is obtained, data convergence is conducted on output of the trained first auxiliary classifier according to the third loss function, the second auxiliary classifier is further determined, data accuracy of the second auxiliary classifier is guaranteed, and model accuracy of the fourth data model is further guaranteed.
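The feature splicing and auxiliary classification described above can be sketched as follows. The one-layer classifier, its parameters, and all numeric values are illustrative assumptions; a real auxiliary classifier would be a small network with learned weights θ_h:

```python
def splice_features(intermediate, final):
    # Splice the teacher's intermediate and final features into one
    # overall feature f_T, as the method above describes.
    return list(intermediate) + list(final)

def linear_classifier(features, weights, bias):
    # Hypothetical one-layer auxiliary classifier h with parameters
    # theta_h = (weights, bias); returns one logit per class.
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]

# Spliced teacher feature for one generated image (illustrative values).
f_t = splice_features([0.5, -1.0], [2.0])
logits = linear_classifier(f_t,
                           weights=[[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]],
                           bias=[0.0, 0.0])
```

The logits are then compared against the teacher's output via the third loss function to train the classifier.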
In one embodiment according to the present application, as shown in fig. 8, a method for generating a data model is provided, where the method for generating a data model includes:
step 802, acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
step 804, performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
step 806, performing data convergence on the output of the third data model by the second auxiliary classifier to obtain a fourth data model.
In this embodiment, the generating means performs convergence processing on the output of the third data model by the second auxiliary classifier to obtain a fourth data model.
Illustratively, in the event that the third data model does not converge, the first auxiliary classifier is retrained to obtain a new second auxiliary classifier.
According to the data model generation method in the embodiment, the second auxiliary classifier is used for carrying out convergence processing on the output of the third data model to obtain the fourth data model, so that the model precision of the fourth data model is guaranteed.
In one embodiment according to the application, the first data model is a trained data model, the second data model is an untrained data model, the third data model is a data model obtained by training the second data model, and the fourth data model is a data model obtained by performing convergence processing on the third data model.
Illustratively, the first data model may be a trained teacher network, the second data model may be an untrained student network, the third data model may be a trained student network, and the fourth data model may be a converged student network.
The technical scheme provided by the application can be applied to different device-side systems such as linux/rtos/android/ios, and provides instruction-level acceleration for different device-side platforms such as armv7/v8 and dsp. The technical scheme of the application has the characteristics of lightweight deployment, strong universality, strong usability and high-performance inference, comprehensively resolves the low-resource bottleneck of intelligent equipment, greatly shortens the AI model deployment cycle, and reaches the industry-leading level in the field of device-side AI deployment. The technical scheme provided by the application can also be applied to a self-developed chip, for example the industry's first three-in-one chip FL119 supporting voice, connectivity and display. The related achievements have fully enabled the mass production and deployment of smart home appliances such as voice-controlled refrigerators, air conditioners and robots, improving their intelligence and efficiency.
At present, deep learning has achieved remarkable results in fields such as image classification, object detection and natural language processing. With the pursuit of precision, deep learning models have become increasingly large, which increases model training and inference time and makes deployment on edge devices difficult. To solve this problem, model compression algorithms have been widely studied; they can compress a large model into a small model with little loss of accuracy. Knowledge distillation is an important model compression algorithm: it extracts 'knowledge' from a huge teacher network and transfers it to a simplified student network. Most current model compression algorithms require the support of the original data set. Such data sets are difficult to obtain in some privacy- and security-sensitive fields. For example, in the medical field, patient data is private information and is difficult to provide, whereas a sufficiently trained large model that does not involve user privacy may be provided. At this point, how to train a small model using only the large model, without touching the original data set, becomes a problem.
To solve this problem, data-free knowledge distillation has been widely studied. Data-free knowledge distillation first extracts information from a trained network to generate images, then uses the generated images to transfer the teacher network's knowledge to a student network; this process is repeated in a loop so that the student network attains higher precision, compressing the large model into the small model without ever touching the original data set. Although data-free knowledge distillation is widely studied, it still has some drawbacks: for example, the generated pictures have homogeneous intermediate features in the teacher network, which is inconsistent with the properties exhibited by real pictures (namely, heterogeneous intermediate features in the teacher network).
The scheme provided by the application can better solve the problem, so that the intermediate characteristics of the generated pictures are heterogenized and are closer to the properties of real data. The application discloses a method for protecting intermediate layer characteristic heterogeneity, which comprises the steps of firstly splicing intermediate characteristics and final characteristics of a teacher network into an integral characteristic, taking the integral characteristic as input, and classifying the integral characteristic through an auxiliary classifier, wherein the class is a preset class for generating pictures. The auxiliary classifier is initialized together with the student network at the beginning, weight parameters of the auxiliary classifier are fixed in the process of generating the picture, and noise and a generator are trained through the following loss items.
L_H = CE(h(f_T; θ_h), t);
wherein CE is the cross-entropy loss function, h is the first auxiliary classifier, f_T is the characteristic parameter of the first image, θ_h is the weight parameter of the first auxiliary classifier, and t is the preset category of the first image. Through this loss term, the category of the generated picture can be preset, and the intermediate features of the picture are also heterogenized. In the knowledge distillation step, the auxiliary classifier is trained together with the student network on the generated pictures.
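The loss term L_H above can be sketched in pure Python. The logits passed in stand for h(f_T; θ_h), the auxiliary classifier's output on the spliced teacher features; the numeric values are illustrative only:

```python
import math

def cross_entropy(logits, target):
    # CE between the auxiliary classifier's logits and the preset class t,
    # computed with a numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

# L_H for one generated picture whose preset category t is class 0.
l_h = cross_entropy([2.0, 0.5, -1.0], target=0)
```

Minimising this term pushes the generator (and the noise) to produce pictures whose spliced teacher features the auxiliary classifier assigns to the preset category.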
This method can effectively heterogenize the intermediate features of the generated pictures in the teacher network, and the t-SNE visualization of the intermediate features of the various pictures shows better results. Knowledge distillation performed with this scheme obtains positive results on various data sets and networks.
According to the technical scheme provided by the application, a high-precision small network can be obtained through distillation without data knowledge, model compression methods such as pruning and quantization can be performed under the framework, an original redundant network is regarded as a teacher network, and a compressed structure (namely a low-bit model or a pruned sparse network) is regarded as a student network.
The technical scheme provided by the application can be applied to privacy- and security-sensitive fields, such as the medical field, where data sets are often highly private and cannot be disclosed externally. In the case where a small network needs to be deployed on medical edge devices, but only the large network's weight parameters are provided and no customer privacy data is provided, data-free knowledge distillation can be applied.
In one embodiment according to the present application, as shown in fig. 9, a method for generating a data model is provided, where the method for generating a data model includes:
step 902, obtaining model training data, and performing model training according to the model training data to obtain a first data model;
step 904, acquiring a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
step 906, performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
step 908, determining a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, the generating means acquires model training data, performs data training based on the model training data, and creates the first data model, wherein the model training data is generation data for generating the first data model.
The model training data may be, for example, in particular image data for generating the first data model.
The data model generating method in the embodiment performs data training according to the model training data, so that a first data model is created, and the model accuracy of the first data model is ensured.
As shown in fig. 10, in an embodiment of the present invention, there is provided a data model generating apparatus 1000, where the data model generating apparatus 1000 includes:
the first processing module 1002 is configured to obtain a first data model, a second data model, a training image set, and a first auxiliary classifier, and perform data training on the second data model through the training image set and the first data model to obtain a third data model;
the first processing module 1002 is further configured to perform data training on the first auxiliary classifier through the first data model and the training image set, so as to obtain a second auxiliary classifier;
the first processing module 1002 is further configured to determine a fourth data model according to the second auxiliary classifier and the third data model.
In this embodiment, a generating device 1000 of a data model is provided, where a first processing module 1002 obtains a first data model, a second data model, a training image set and a first auxiliary classifier respectively, where the first data model is a fully trained data model, the second data model is an untrained data model, the training image set is a data image for model training, and the first auxiliary classifier is an existing auxiliary classifier.
Illustratively, the first data model may be embodied as a fully trained teacher network and the second data model may be embodied as an untrained student network.
The first processing module 1002 performs data training on the second data model for multiple rounds according to the training image set and the first data model, and generates a third data model, where the third data model is a model trained by the second data model.
The third data model may be embodied, for example, as a trained student network.
The first processing module 1002 trains and optimizes the first auxiliary classifier according to the first data model and the training image set, and determines a second auxiliary classifier corresponding to the first auxiliary classifier, where the second auxiliary classifier is an auxiliary classifier after optimization.
The second auxiliary classifier may be embodied as an optimized auxiliary classifier, which may have a higher accuracy, for example.
The first processing module 1002 defines an output of the third data model according to the second auxiliary classifier to obtain a fourth data model, where the fourth data model is a data model with converged output.
The fourth data model may be embodied as a high-precision small network, for example.
Illustratively, a fourth data model may be used for identification and classification of images.
For example, as shown in fig. 11, the first processing module 1002 may prepare models such as the first data model and the second data model, initialize and generate images to obtain training images, save and sample the training images to obtain a training image set, train the student network and the auxiliary classifier through the training image set, and output the student network in the case that the student network (Student) converges.
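The figure-11 flow can be sketched as a loop. Every callable here is an assumption standing in for a real component (generator, student training step, auxiliary-classifier training step, convergence test):

```python
def run_data_free_distillation(generator_fn, student_step, aux_step,
                               converged, max_rounds=100):
    # One round: generate images, add them to the pool, sample a batch,
    # train the student and the auxiliary classifier, test convergence.
    pool = []
    for _ in range(max_rounds):
        pool.extend(generator_fn())
        batch = pool[-4:]              # stand-in for data sampling
        loss = student_step(batch)
        aux_step(batch)
        if converged(loss):
            break
    return pool

# Toy run: the "student" loss falls each round and converges below 0.5.
losses = iter([1.0, 0.6, 0.4])
pool = run_data_free_distillation(
    generator_fn=lambda: ["img"],
    student_step=lambda batch: next(losses),
    aux_step=lambda batch: None,
    converged=lambda loss: loss < 0.5,
)
```

The loop stops after the third round here, once the stand-in loss drops under the threshold.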
The data model generating device 1000 in this embodiment performs multiple rounds of data training on the second data model according to the training image set and the first data model to generate a third data model, performs training and optimization on the first auxiliary classifier according to the first data model and the training image set, determines a second auxiliary classifier corresponding to the first auxiliary classifier, and defines the output of the third data model according to the second auxiliary classifier to obtain a fourth data model, thereby improving the model precision of the fourth data model, ensuring the recognition accuracy of the fourth data model, optimizing the generating step of the fourth data model, and improving the model generating efficiency of the fourth data model.
In one embodiment according to the present application, the data model generating apparatus 1000 further includes:
the first processing module 1002 is further configured to obtain a first image generator, and generate a first image through the first image generator;
the first processing module 1002 is further configured to determine a first loss function according to the feature parameter of the first image and a preset class;
the first processing module 1002 is further configured to perform data training on the first image generator through the first loss function, so as to obtain a second image generator;
the first processing module 1002 is further configured to generate, by means of the second image generator, a training image set.
The data model generating device 1000 in this embodiment generates a first image through a first image generator, determines a first loss function according to a feature parameter and a preset category of the first image, performs data convergence on an output image of the first image generator through the first loss function, determines a second image generator, and generates image data through the second image generator to obtain a training image set, so that the image quality of the training image set is improved, the data precision of the training image set is ensured, and the model precision of a fourth data model is further ensured.
In one embodiment according to the present application, the data model generating apparatus 1000 further includes:
the first processing module 1002 is further configured to acquire randomly generated noise data, and input the noise data into the first image generator to obtain a first image.
The data model generating apparatus 1000 in this embodiment inputs noise data into the first image generator to obtain the first image, enriching the data amount of the first image.
In one embodiment according to the present application, the data model generating apparatus 1000 further includes:
the first processing module 1002 is further configured to perform data convergence processing on the first image output by the first image generator according to the first loss function, so as to obtain a second image;
the first processing module 1002 is further configured to complete data training for the first image generator to obtain the second image generator if the information amount of the second image is greater than a preset threshold.
The data model generating device 1000 in this embodiment determines the first loss function through the feature parameters of the first image and the preset category of the first image, and expands the application range of the first loss function.
In one embodiment according to the present application, the data model generating apparatus 1000 further includes:
The first processing module 1002 is further configured to store the image output by the second image generator in a database, so as to obtain an image data set;
the first processing module 1002 is further configured to perform data sampling on the image data set to obtain a training image set.
The data model generating device 1000 in this embodiment forms an image data set by storing images in a database, and extracts part of the images in the image data set to obtain a training image set, so as to ensure the data accuracy of the training image set and further ensure the model accuracy of the third data model.
In one embodiment according to the present application, the data model generating apparatus 1000 further includes:
the first processing module 1002 is further configured to set the training image set to the first input data of the second data model;
the first processing module 1002 is further configured to determine a second loss function according to the first data model and the second data model;
the first processing module 1002 is further configured to perform model training on the second data model according to the first data model and the first input data, so as to obtain a trained second data model;
the first processing module 1002 is further configured to perform data convergence on the trained second data model according to the second loss function, so as to obtain a third data model.
The generating device 1000 of the data model in this embodiment obtains the first input data by setting the training image set as the input data of the second data model, performs data training on the second data model according to the first data model and the first input data to obtain a trained second data model, and converges the output of the trained second data model according to the second loss function to obtain a third data model, so as to ensure the model precision of the third data model and further ensure the model precision of the fourth data model.
In one embodiment according to the present application, the data model generating apparatus 1000 further includes:
the first processing module 1002 is further configured to set model features of the first data model as second input data of the first auxiliary classifier;
the first processing module 1002 is further configured to determine a third loss function according to the first data model and the second data model;
the first processing module 1002 is further configured to perform data training on the first auxiliary classifier according to the second input data and the training image set, so as to obtain a trained first auxiliary classifier;
the first processing module 1002 is further configured to perform data convergence on the trained first auxiliary classifier according to the third loss function, so as to obtain a second auxiliary classifier.
The generating device 1000 of the data model in this embodiment obtains the second input data by setting the model feature of the first data model as the input data of the first auxiliary classifier, trains and optimizes the first auxiliary classifier according to the second input data and the training image set to obtain the trained first auxiliary classifier, and then converges the output of the trained first auxiliary classifier according to the third loss function, so as to determine the second auxiliary classifier, thereby ensuring the data accuracy of the second auxiliary classifier and further ensuring the model accuracy of the fourth data model.
In one embodiment according to the present application, the data model generating apparatus 1000 further includes:
the first processing module 1002 is further configured to perform data convergence on the output of the third data model by using the second auxiliary classifier, so as to obtain a fourth data model.
The data model generating device 1000 in this embodiment converges the output of the third data model through the second auxiliary classifier to obtain the fourth data model, so as to ensure the model accuracy of the fourth data model.
In one embodiment according to the present application, the data model generating apparatus 1000 further includes:
The first processing module 1002 is further configured to obtain model training data, and perform model training according to the model training data to obtain a first data model.
The data model generating device 1000 in this embodiment performs data training according to model training data, so as to create a first data model, thereby ensuring the model accuracy of the first data model.
In an embodiment according to the present application, as shown in fig. 12, a data model generating apparatus 1200 is provided, where the data model generating apparatus 1200 includes a processor 1202 and a memory 1204, and a program or an instruction is stored in the memory 1204, and the program or the instruction is executed by the processor 1202 to implement the steps of the data model generating method in any of the above-mentioned aspects. Therefore, the data model generating apparatus 1200 has all the advantages of the data model generating method according to any of the above-described aspects, and will not be described in detail herein.
The execution subject of the technical scheme of the image processing method provided by the application can be an image processing device, and can be determined according to actual use requirements, and is not particularly limited herein. In order to more clearly describe the image processing method provided by the present application, an image processing apparatus is used as an execution subject.
In one embodiment according to the present application, as shown in fig. 13, there is provided an image processing method including:
step 1302, acquiring an image set to be detected and an image classification model;
in step 1304, the image set to be detected is classified by the image classification model, so as to obtain a target image set.
In this embodiment, an image processing method is provided, in which an image processing apparatus acquires an image set to be detected and an image classification model, respectively, and performs classification processing on the image set to be detected through the image classification model, to determine a target image set in the image set to be detected. The image set to be detected is a data set of images to be detected, the target image set comprises effective images in the image set to be detected, and the image classification model is a data model determined by the generation method of the data model in any of the above embodiments.
By way of example, the set of images to be detected may comprise detection images of the individual organs of the patient, and the set of target images comprises breast-type images of the patient.
According to the image processing method, the image collection to be detected is classified through the image classification model, the target image collection in the image collection to be detected is determined, the classification accuracy of the image collection to be detected is improved through the image classification model, and the data accuracy of the target image collection is further guaranteed.
In one embodiment according to the application, the set of images to be detected comprises any one of the following: medical field images, industrial equipment images, biological images.
In this embodiment, the image set to be detected may be a medical field image, an industrial device image, or a biological image, where the medical field image is an image of a medical field, the industrial device image is a real-time image of an industrial device, and the biological image is an image containing various living things.
The medical field image may be embodied as a detection image of a patient in a hospital, which is classified into images corresponding to different organs by an image classification model, for example.
For example, the industrial device image may specifically include images of various engineering vehicles, and the industrial device image is classified into images corresponding to different engineering vehicles through an image classification model.
For example, the biological image may be an image of a face, and the biological image is classified into images corresponding to different faces by an image classification model.
The image processing method in the embodiment expands the application range of the image classification model and enriches the application scene of the image classification model by limiting the image to be detected to be an image in the medical field, an image of industrial equipment or a biological image.
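The classification of the image set to be detected into a target image set can be sketched as follows. The dict-based images and the score-lookup model are illustrative stand-ins for real images and a real image classification model:

```python
def argmax(scores):
    # Index of the highest class score.
    return max(range(len(scores)), key=scores.__getitem__)

def select_target_images(image_set, model, target_class):
    # Classify every image in the set to be detected and keep those whose
    # predicted class equals the target class (the "target image set").
    return [img for img in image_set
            if argmax(model(img)) == target_class]

def model(img):
    # Stand-in classification model: reads precomputed class scores.
    return img["scores"]

images = [{"id": 1, "scores": [0.9, 0.1]},
          {"id": 2, "scores": [0.2, 0.8]},
          {"id": 3, "scores": [0.7, 0.3]}]
targets = select_target_images(images, model, target_class=0)
```

Here images 1 and 3 are predicted as class 0 and form the target image set.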
In one embodiment according to the present application, as shown in fig. 14, there is provided an image processing apparatus 1400, the image processing apparatus 1400 including:
a second processing module 1402, configured to obtain an image set to be detected and an image classification model;
the second processing module 1402 is further configured to perform classification processing on the image set to be detected through an image classification model, so as to obtain a target image set.
In this embodiment, an image processing apparatus 1400 is provided, in which the second processing module 1402 obtains an image set to be detected and an image classification model, and performs classification processing on the image set to be detected by the image classification model to determine a target image set in the image set to be detected. The image set to be detected is a data set of images awaiting detection, the target image set comprises the effective images in the image set to be detected, and the image classification model is a data model determined by the method for generating a data model in any of the above embodiments.
By way of example, the set of images to be detected may comprise detection images of the individual organs of the patient, and the set of target images comprises the chest-type images of the patient.
The image processing apparatus 1400 in this embodiment performs classification processing on the image set to be detected through the image classification model, determines the target image set in the image set to be detected, and improves the classification accuracy of the image set to be detected through the image classification model, thereby ensuring the data accuracy of the target image set.
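A minimal sketch of the two operations of the second processing module 1402 — obtain a classification model, then classify an image set into a target set — assuming the trained model can be treated as a callable from image to label; `classify_image_set` and the mean-intensity toy model are hypothetical stand-ins, not part of the application.

```python
import numpy as np

def classify_image_set(images, model, target_label):
    """Classify every image in the set and keep those whose predicted
    label matches the target label (the 'effective images')."""
    return [img for img in images if model(img) == target_label]

# Toy stand-in for the trained image classification model:
# label each image by its mean pixel intensity.
def toy_model(img):
    return "target" if img.mean() > 0.5 else "other"

rng = np.random.default_rng(0)
to_detect = [rng.uniform(0.0, 1.0, (8, 8)) for _ in range(16)]
target_set = classify_image_set(to_detect, toy_model, "target")
```

In practice the callable would wrap the fourth data model produced by the generation method; the filtering step itself is unchanged.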
In an embodiment according to the present application, as shown in fig. 15, an image processing apparatus 1500 is provided, the image processing apparatus 1500 includes a processor 1502 and a memory 1504, and a program or an instruction is stored in the memory 1504, and the program or the instruction implements the steps of the image processing method in any of the above-described aspects when executed by the processor 1502. Therefore, the image processing apparatus 1500 has all the advantages of the image processing method according to any of the above embodiments, and will not be described in detail herein.
In an embodiment according to the present application, there is provided a readable storage medium having stored thereon a program or instructions which, when executed by a processor, implement the method of generating a data model in any of the embodiments described above or the method of processing an image in any of the embodiments described above, thereby having all the advantageous technical effects of the method of generating a data model in any of the embodiments described above or the method of processing an image in any of the embodiments described above.
The readable storage medium may be, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In an embodiment according to the present application, there is provided a computer program product comprising computer instructions which, when executed by a processor, implement the method of generating a data model in any of the embodiments described above or the method of processing an image in any of the embodiments described above, thereby having all the advantageous technical effects of the method of generating a data model in any of the embodiments described above or the method of processing an image in any of the embodiments described above.
In an embodiment according to the application, a detection device is provided, comprising the generating means of the data model in any of the embodiments described above, and/or the readable storage medium in any of the embodiments described above, and/or the computer program product in any of the embodiments described above, and thus having the generating means of the data model in any of the embodiments described above, and/or the readable storage medium in any of the embodiments described above, and/or all the advantageous technical effects of the computer program product in any of the embodiments described above.
In one embodiment according to the application, the detection device is any one of the following: an electronic computed tomography (CT) device, a nuclear magnetic resonance detection device, or an infrared detection device.
In an embodiment of the present application, a chip is provided. The chip includes a program or instructions, and when the chip runs, the program or instructions are used to implement the method for generating a data model in any of the foregoing technical solutions or the image processing method in any of the foregoing technical solutions, so that the chip has all the corresponding beneficial effects, which are not described herein again.
It is to be understood that in the claims, specification and drawings of the present application, the term "plurality" means two or more; unless otherwise explicitly defined, the orientation or positional relationship indicated by terms such as "upper" and "lower" is based on the orientation or positional relationship shown in the drawings, is used only for convenience in describing the present application, and does not indicate or imply that the apparatus or element in question must have the particular orientation described or be constructed and operated in that particular orientation, so these descriptions should not be construed as limiting the present application. The terms "connected," "mounted," "secured," and the like are to be construed broadly and may be, for example, a fixed connection between a plurality of objects, a removable connection between a plurality of objects, or an integral connection; the objects may be directly connected to each other or indirectly connected through an intermediate medium. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
In the claims, specification, and drawings of the present invention, the descriptions of terms "one embodiment," "some embodiments," "particular embodiments," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In the claims, specification and drawings of the present invention, the schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (15)

1. A method for generating a data model, the method comprising:
acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
Performing data training on the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier;
determining a fourth data model according to the second auxiliary classifier and the third data model, wherein the method specifically comprises the following steps: performing data convergence on the output of the third data model through the second auxiliary classifier to obtain the fourth data model;
the training the first auxiliary classifier through the first data model and the training image set to obtain a second auxiliary classifier includes:
setting model features of the first data model as second input data of the first auxiliary classifier, wherein the model features are feature parameters in the first data model;
determining a third loss function according to the first data model and the second data model;
according to the second input data and the training image set, performing data training on the first auxiliary classifier to obtain a trained first auxiliary classifier;
and carrying out data convergence on the trained first auxiliary classifier according to the third loss function so as to obtain the second auxiliary classifier.
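The auxiliary-classifier training recited in claim 1 can be sketched as follows: teacher-model features serve as the second input data to a linear classifier h trained with cross-entropy, and the separate data-convergence step under the third loss function is abbreviated here to a fixed number of gradient steps. This is a minimal numpy sketch under those assumptions — `train_auxiliary_classifier` and the toy features are hypothetical, not from the application.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def train_auxiliary_classifier(features, labels, n_classes, lr=0.1, steps=200):
    """Train a linear auxiliary classifier h on first-data-model (teacher)
    features — the 'second input data' — with cross-entropy loss."""
    rng = np.random.default_rng(0)
    W = rng.standard_normal((features.shape[1], n_classes)) * 0.01
    onehot = np.eye(n_classes)[labels]
    for _ in range(steps):
        probs = softmax(features @ W)
        grad = features.T @ (probs - onehot) / len(labels)  # CE gradient
        W -= lr * grad
    return W

# Hypothetical teacher features for a two-class toy problem.
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(-1, 0.3, (32, 4)), rng.normal(1, 0.3, (32, 4))])
labels = np.array([0] * 32 + [1] * 32)
W = train_auxiliary_classifier(feats, labels, n_classes=2)
preds = softmax(feats @ W).argmax(axis=1)
```

The real auxiliary classifier would sit on top of intermediate teacher features rather than hand-made vectors, but the training loop has the same shape.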
2. The method for generating a data model according to claim 1, characterized in that the method for generating a data model further comprises:
acquiring a first image generator, and generating a first image through the first image generator;
determining a first loss function according to the characteristic parameters of the first image and a preset category;
performing data training on the first image generator through the first loss function to obtain a second image generator;
and generating the training image set through the second image generator.
3. The method for generating a data model according to claim 2, wherein the generating, by the first image generator, a first image includes:
randomly generated noise data is acquired and input into the first image generator to obtain the first image.
4. The method for generating a data model according to claim 2, wherein the training the first image generator by the first loss function to obtain a second image generator includes:
according to the first loss function, carrying out data convergence processing on the first image output by the first image generator to obtain a second image;
And under the condition that the information quantity of the second image is larger than a preset threshold value, completing data training of the first image generator to obtain the second image generator, wherein the information quantity of the second image is a parameter for representing the information content of the second image.
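Claims 3 and 4 above can be sketched as follows, assuming (the claims do not fix this) that the "information quantity" of an image is measured as the Shannon entropy of its pixel histogram; the toy generator, `generate_first_image`, and the threshold value are all hypothetical.

```python
import numpy as np

def information_quantity(image, bins=16):
    """Shannon entropy of the pixel histogram — one possible stand-in for
    the 'information quantity' parameter; the claim does not fix a measure."""
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def generate_first_image(generator, noise_dim, rng):
    """Claim 3: acquire randomly generated noise and input it to the
    first image generator to obtain the first image."""
    z = rng.standard_normal(noise_dim)
    return generator(z)

# Toy generator: a fixed random linear map of noise into an 8x8 'image'.
rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, (64, 16))
toy_generator = lambda z: (1.0 / (1.0 + np.exp(-(A @ z)))).reshape(8, 8)

img = generate_first_image(toy_generator, noise_dim=16, rng=rng)
done = information_quantity(img) > 1.0  # claim 4's threshold check (toy value)
```

In the full method this check would gate the end of generator training; here it is evaluated once on a single sample.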
5. The method for generating a data model according to claim 2, wherein the first loss function includes a first loss term, a second loss term and a third loss term, and the calculation formula of the first loss term L_H is as follows:
L_H = CE(h(f_T; θ_h), t);
wherein CE is a cross entropy loss function, h is the first auxiliary classifier, f_T is a characteristic parameter of the first image, θ_h is the weight parameter of the first auxiliary classifier, and t is the preset category of the first image;
the calculation formula of the second loss term L_bn is specifically as follows:
L_bn = Σ_{i=1}^{N} ( ‖μ_i − μ̂_i‖ + ‖σ_i − σ̂_i‖ );
wherein x is an input image, N is the number of parameters, Σ is a summation symbol, μ_i and σ_i are the mean and the variance computed from x, i is the image sequence number, and μ̂_i and σ̂_i are the standard parameters;
the calculation formula of the third loss term L_adv is specifically as follows:
L_adv = −τ² · KL( σ(T(x; θ_t)/τ) ‖ σ(S(x; θ_s)/τ) );
wherein x is an input image, KL is a divergence, T is the first data model, S is the second data model, τ is the temperature of the knowledge distillation, σ(·) is the softmax that yields the standard value of the image, θ_t is a network parameter of the first data model, and θ_s is a network parameter of the second data model.
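The three loss terms of claim 5 can be sketched numerically as follows. This is a hedged reading: L_H is taken as a cross-entropy on the auxiliary classifier's output for the preset category, L_bn as a sum of distances between feature statistics and stored standard parameters, and L_adv as a negative temperature-scaled KL divergence between the outputs of the first (teacher) and second (student) data models — parts of the published formulas are rendered as images, so the forms below follow common data-free distillation practice rather than the patent verbatim. All function names are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def loss_H(aux_logits, t):
    # First term: cross entropy CE(h(f_T; θ_h), t) between the auxiliary
    # classifier's output on teacher features and the preset category t.
    return float(-np.log(softmax(aux_logits)[t]))

def loss_bn(mu, sigma, mu_std, sigma_std):
    # Second term: distance between feature statistics (μ_i, σ_i) and the
    # stored standard parameters (μ̂_i, σ̂_i), summed over i.
    return float(sum(abs(m - ms) + abs(s - ss)
                     for m, ms, s, ss in zip(mu, mu_std, sigma, sigma_std)))

def loss_adv(teacher_logits, student_logits, tau=2.0):
    # Third term: negative temperature-scaled KL divergence between the
    # first (teacher) and second (student) model outputs; minimizing it
    # drives the generated image toward teacher/student disagreement.
    p = softmax(np.asarray(teacher_logits) / tau)
    q = softmax(np.asarray(student_logits) / tau)
    kl = float(np.sum(p * (np.log(p) - np.log(q))))
    return -(tau ** 2) * kl
```

Note that L_adv is zero when the two models agree exactly and grows more negative as they diverge, which is what makes it useful as a generator objective.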
6. The method for generating a data model according to claim 2, wherein the generating, by the second image generator, the training image set includes:
storing the image output by the second image generator in a database to obtain an image data set;
and carrying out data sampling on the image data set to obtain the training image set.
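Claim 6's two steps — store generator outputs in a database, then sample from it to obtain the training image set — might be sketched as below, with a plain list standing in for the database; `build_training_image_set` is a hypothetical name.

```python
import random

def build_training_image_set(generated_images, sample_size, seed=0):
    """Store the second image generator's outputs in a 'database' (a list
    here) and perform data sampling to obtain the training image set."""
    database = list(generated_images)              # store outputs
    sampler = random.Random(seed)
    return sampler.sample(database, sample_size)   # data sampling step

# Toy stand-in: integers in place of generated images.
training_image_set = build_training_image_set(range(100), sample_size=10)
```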
7. The method for generating a data model according to claim 1, wherein the training the second data model through the training image set and the first data model to obtain a third data model includes:
setting the training image set as first input data of the second data model;
determining a second loss function according to the first data model and the second data model;
model training is carried out on the second data model according to the first data model and the first input data so as to obtain a trained second data model;
And carrying out data convergence on the trained second data model according to the second loss function to obtain the third data model.
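The student-training steps of claim 7 can be sketched as follows, reading the "second loss function determined according to the first data model and the second data model" as a temperature-scaled KL divergence between their soft outputs — a common choice in knowledge distillation, though the claim does not fix it. Both models are reduced to toy linear maps, and every name here is hypothetical.

```python
import numpy as np

def softmax(z, tau=1.0):
    zt = z / tau
    e = np.exp(zt - zt.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def second_loss(teacher_logits, student_logits, tau=2.0):
    """Mean KL divergence between tempered teacher and student outputs."""
    p = softmax(teacher_logits, tau)
    q = softmax(student_logits, tau)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean())

def distill(teacher_logits, features, steps=300, lr=0.5, tau=2.0):
    """Train a linear student (third data model) to match the first data
    model's outputs on the training set (pre-extracted toy features)."""
    rng = np.random.default_rng(0)
    W = rng.standard_normal((features.shape[1], teacher_logits.shape[1])) * 0.01
    p = softmax(teacher_logits, tau)
    for _ in range(steps):
        q = softmax(features @ W, tau)
        grad = features.T @ (q - p) / (tau * len(features))  # KL gradient
        W -= lr * grad
    return W

rng = np.random.default_rng(2)
feats = rng.standard_normal((64, 5))
teacher = feats @ rng.standard_normal((5, 3))   # toy 'first data model' outputs
W = distill(teacher, feats)
before = second_loss(teacher, feats @ np.zeros((5, 3)))
after = second_loss(teacher, feats @ W)
```

The subsequent data-convergence step of the claim would continue updates under the same loss until a convergence criterion is met; here a fixed step budget stands in for it.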
8. The method of generating a data model according to any one of claims 1 to 7, wherein the acquiring a first data model includes:
and obtaining model training data, and performing model training according to the model training data to obtain the first data model.
9. The method according to any one of claims 1 to 7, wherein the first data model is a trained data model, the second data model is an untrained data model, the third data model is a data model obtained by training data of the second data model, and the fourth data model is a data model obtained by performing convergence processing on the third data model.
10. The method of any of claims 1 to 7, wherein the first data model comprises a trained teacher network, the second data model comprises an untrained student network, the third data model comprises a trained student network, and the fourth data model comprises a converged student network.
11. An image processing method, characterized in that the image processing method comprises:
acquiring an image set to be detected and an image classification model;
classifying the image set to be detected by the image classification model to obtain a target image set, wherein the image classification model is a data model determined by the method for generating a data model according to any one of claims 1 to 10.
12. The image processing method according to claim 11, wherein the set of images to be detected includes any one of: medical field images, industrial equipment images, biological images.
13. A data model generation apparatus, characterized in that the data model generation apparatus includes:
the first processing module is used for acquiring a first data model, a second data model, a training image set and a first auxiliary classifier, and performing data training on the second data model through the training image set and the first data model to obtain a third data model;
the first processing module is further configured to perform data training on the first auxiliary classifier through the first data model and the training image set, so as to obtain a second auxiliary classifier;
The first processing module is further configured to determine a fourth data model according to the second auxiliary classifier and the third data model;
the first processing module is further configured to perform data convergence on the output of the third data model through the second auxiliary classifier, so as to obtain the fourth data model;
the first processing module is further configured to set model features of the first data model as second input data of the first auxiliary classifier; wherein the model features are feature parameters in the first data model;
the first processing module is further configured to determine a third loss function according to the first data model and the second data model;
the first processing module is further configured to perform data training on the first auxiliary classifier according to the second input data and the training image set, so as to obtain a trained first auxiliary classifier;
the first processing module is further configured to perform data convergence on the trained first auxiliary classifier according to the third loss function, so as to obtain a second auxiliary classifier.
14. A readable storage medium, characterized in that a program or instructions is stored on the readable storage medium, which when executed by a processor, implements the steps of the data model generation method according to any one of claims 1 to 10 or the image processing method according to claim 11 or 12.
15. A chip comprising a program or instructions for implementing the steps of the data model generation method according to any one of claims 1 to 10 or the image processing method according to claim 11 or 12 when the chip is running.
CN202310907961.XA 2023-07-24 2023-07-24 Data model generation method, image processing method, device and chip Active CN116630724B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310907961.XA CN116630724B (en) 2023-07-24 2023-07-24 Data model generation method, image processing method, device and chip


Publications (2)

Publication Number Publication Date
CN116630724A CN116630724A (en) 2023-08-22
CN116630724B true CN116630724B (en) 2023-10-10

Family

ID=87597665


Country Status (1)

Country Link
CN (1) CN116630724B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159073A (en) * 2021-04-23 2021-07-23 上海芯翌智能科技有限公司 Knowledge distillation method and device, storage medium and terminal
CN113762368A (en) * 2021-08-27 2021-12-07 北京市商汤科技开发有限公司 Method, device, electronic equipment and storage medium for data distillation
CN115409157A (en) * 2022-08-25 2022-11-29 浙江大学 Non-data knowledge distillation method based on student feedback
CN115564984A (en) * 2022-08-25 2023-01-03 天翼电子商务有限公司 Multi-teacher supervised dataless knowledge distillation method
CN115700845A (en) * 2022-11-15 2023-02-07 智慧眼科技股份有限公司 Face recognition model training method, face recognition device and related equipment




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant