CN111310918A - Data processing method and device, computer equipment and storage medium - Google Patents

Data processing method and device, computer equipment and storage medium

Info

Publication number
CN111310918A
CN111310918A (application CN202010079087.1A)
Authority
CN
China
Prior art keywords
model
pruning
sample
performance
success rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010079087.1A
Other languages
Chinese (zh)
Other versions
CN111310918B (en)
Inventor
余翀
张银锋
张应国
邓巍然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010079087.1A priority Critical patent/CN111310918B/en
Publication of CN111310918A publication Critical patent/CN111310918A/en
Application granted granted Critical
Publication of CN111310918B publication Critical patent/CN111310918B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiments of this application disclose a data processing method, a device, and a storage medium. The method includes: pruning a first model to obtain a pruning model; training the pruning model to obtain a second model, and obtaining the training-result success rate of the second model; if the training-result success rate is greater than a success-rate threshold, predicting the model running performance corresponding to the second model from the running performance of the original model and the model topology of the second model; if the model running performance corresponding to the second model does not reach a performance threshold, pruning the second model again; and if the model running performance corresponding to the second model reaches the performance threshold, determining the second model to be the target model used for business processing. The embodiments of this application enable automated pruning and can thereby improve the efficiency of model pruning.

Description

Data processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, a computer device, and a storage medium.
Background
Various business processes, such as image recognition, artificial-intelligence games, and speech recognition, can be performed by deep-learning neural network models. Because a neural network model contains a large number of model parameters, it consumes a large amount of storage space. Therefore, to reduce the storage overhead of the neural network model, the model needs to be pruned.
At present, neural network models are usually pruned manually. When a neural network model is pruned, the running performance of the pruned model must be tested by hand to obtain a model whose running performance reaches the performance threshold. However, multiple rounds of pruning are often needed before a model reaching the performance threshold is obtained, and each round requires another manual test of the pruned model's running performance. The whole manual pruning process is therefore time-consuming and reduces model-pruning efficiency.
Summary
The embodiments of this application provide a data processing method, a data processing apparatus, a computer device, and a storage medium that can improve the efficiency of model pruning.
An embodiment of the present application provides a data processing method, including:
pruning a first model to obtain a pruning model;
training the pruning model to obtain a second model, and obtaining the training-result success rate of the second model;
if the training-result success rate is greater than a success-rate threshold, predicting the model running performance corresponding to the second model from the running performance of the original model and the model topology of the second model;
if the model running performance corresponding to the second model does not reach a performance threshold, pruning the second model again;
and if the model running performance corresponding to the second model reaches the performance threshold, determining the second model to be the target model used for business processing.
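The steps above form an automated prune-train-evaluate loop. The following is a minimal, hypothetical Python sketch of that loop; the callables `prune`, `train`, `success_rate`, and `predict_performance`, and the use of a single value to stand in for a model, are illustrative assumptions and not names from the patent.

```python
# Hypothetical sketch of the automated pruning loop described above.
# A "model" is whatever the supplied callables operate on; the rollback
# to the last successful model mirrors the fallback step described later.

def auto_prune(first_model, success_threshold, perf_threshold,
               prune, train, success_rate, predict_performance):
    model = first_model
    last_success = first_model          # latest successfully pruned model
    while True:
        second_model = train(prune(model))
        if success_rate(second_model) > success_threshold:
            last_success = second_model
            if predict_performance(second_model) >= perf_threshold:
                return second_model     # target model for business processing
            model = second_model        # performance not reached: prune again
        else:
            model = last_success        # roll back to the last successful model
```

As a usage example, with a toy "model" that is just a channel count, pruning ten channels per round and predicting performance as throughput inversely proportional to size, the loop stops as soon as the predicted performance crosses the threshold.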
Wherein, the pruning of the first model to obtain a pruning model includes:
obtaining the result influence degree corresponding to each channel parameter among the model parameters of the first model;
determining pruning channels among the channel parameters based on the result influence degree;
and deleting the pruning channels from the first model to obtain the pruning model.
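The pruning steps above can be sketched as follows. The patent does not specify how the result influence degree is measured; the L1 norm of each channel's weights, used here, is one common choice and is an assumption for illustration.

```python
import numpy as np

# Hedged sketch of channel pruning: score each output channel by the L1
# norm of its weights (an assumed stand-in for the "result influence
# degree"), mark the lowest-scoring channels as pruning channels, and
# delete them from the weight tensor.

def prune_channels(weights, prune_ratio=0.25):
    """weights: array of shape (out_channels, ...). Returns the kept
    channel indices and the pruned weight tensor."""
    scores = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    n_prune = int(len(scores) * prune_ratio)
    prune_idx = np.argsort(scores)[:n_prune]      # lowest influence first
    keep_idx = np.setdiff1d(np.arange(len(scores)), prune_idx)
    return keep_idx, weights[keep_idx]
```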
Wherein, the training of the pruning model to obtain a second model and the obtaining of the training-result success rate of the second model include:
training the pruning model on first sample data to obtain the second model;
storing the training results that the second model outputs for the first sample data in a statistical database, where the statistical database is used to store the training-result success rate of the second model, and the training-result success rate is generated by comparing the training results against the sample labels corresponding to the first sample data;
and reading the training-result success rate of the second model from the statistical database.
Wherein, the predicting, if the training-result success rate is greater than the success-rate threshold, of the model running performance corresponding to the second model from the running performance of the original model and the model topology of the second model includes:
if the training-result success rate is greater than the success-rate threshold, inputting the model topology of the second model and the model parameters of the second model into a prediction model associated with the original model, where the original model is the model that is associated with the first model and has not been pruned;
predicting, by the prediction model, the performance ratio corresponding to the second model;
performance-testing the original model to obtain the running performance of the original model;
and determining the model running performance corresponding to the second model based on the running performance of the original model and the performance ratio.
Wherein, the performance-testing of the original model to obtain the running performance of the original model includes:
performance-testing the original model to obtain at least two initial running performances corresponding to the original model;
averaging the at least two initial running performances to obtain the mean value corresponding to the at least two initial running performances;
and determining the mean value to be the running performance of the original model.
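The performance test above can be sketched as timing several inference runs and taking their mean. `run_inference` is a hypothetical callable standing in for the original model; it is not named in the patent.

```python
import statistics
import time

# Sketch of the performance test: run inference at least twice, record
# each initial running performance (here, wall-clock time per run), and
# take the mean as the original model's running performance.

def measure_running_performance(run_inference, n_runs=5):
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        run_inference()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)
```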
Wherein, the method further includes:
if the training-result success rate is less than or equal to the success-rate threshold, obtaining the successful pruning model with the latest storage timestamp from a locally stored set of successful pruning models, where a successful pruning model is a second model whose training-result success rate is greater than the success-rate threshold;
and updating the pruning model based on the successful pruning model with the latest storage timestamp.
Wherein, the method further includes:
storing a second model whose training-result success rate is greater than the success-rate threshold in the set of successful pruning models, and determining that second model to be the successful pruning model with the latest storage timestamp in the set.
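The two steps above (storing successful models and rolling back to the most recent one) can be sketched with a timestamped list; the list and function names are illustrative assumptions.

```python
import time

# Sketch of the locally stored set of successful pruning models: each
# second model that passes the success-rate threshold is appended with a
# storage timestamp, and the rollback step fetches the latest one.

successful_pruning_models = []  # (timestamp, model), in storage order

def store_successful_model(model):
    successful_pruning_models.append((time.time(), model))

def latest_successful_model():
    # entries are appended in non-decreasing timestamp order, so the
    # last entry is the one with the latest storage timestamp
    return successful_pruning_models[-1][1]
```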
Wherein, the method further includes:
obtaining model parameters of a sample model, where the model parameters of the sample model have the same model topology as the model parameters of the original model;
randomly deleting channel parameters from the model parameters in the model topology of the sample model to obtain a sample pruning model corresponding to the sample model;
training the sample pruning model on second sample data to obtain a sample training model;
obtaining the actual performance ratio between the sample training model and the sample model;
predicting, by an initial prediction model, the predicted performance ratio corresponding to the sample training model;
adjusting the initial prediction model according to the actual performance ratio and the predicted performance ratio;
and when the adjusted initial prediction model satisfies a convergence condition, determining the adjusted initial prediction model to be the prediction model associated with the original model.
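The procedure above can be illustrated with a deliberately simplified stand-in: the "model topology" is reduced to a single feature (the fraction of channels kept, an assumption made purely for illustration), and a one-dimensional least-squares fit plays the role of the initial prediction model, whose form the patent does not specify.

```python
# Hedged sketch: fit a predictor from a topology feature to the actual
# performance ratio measured on sample pruning models. A closed-form 1-D
# least-squares fit stands in for the unspecified prediction model.

def fit_ratio_predictor(kept_fractions, actual_ratios):
    n = len(kept_fractions)
    mx = sum(kept_fractions) / n
    my = sum(actual_ratios) / n
    sxx = sum((x - mx) ** 2 for x in kept_fractions)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(kept_fractions, actual_ratios))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept  # predicted performance ratio
```

For example, if a model with all channels kept has ratio 1.0 and one with half the channels has ratio 2.0, the fitted predictor interpolates between them.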
Wherein, the predicting, by the initial prediction model, of the predicted performance ratio corresponding to the sample training model includes:
inputting the model topology of the sample training model and the model parameters of the sample training model into the initial prediction model, and predicting, by the initial prediction model, the predicted performance ratio corresponding to the sample training model.
Wherein, the method further includes:
determining the loss value of the adjusted initial prediction model according to the actual performance ratio, the predicted performance ratio output by the adjusted initial prediction model, and the regularization loss value of the adjusted initial prediction model;
when the loss value is smaller than a loss-function threshold, obtaining the prediction-result success rate of the adjusted initial prediction model;
if the prediction-result success rate of the adjusted initial prediction model is greater than a prediction success-rate threshold, determining that the adjusted initial prediction model satisfies the convergence condition;
and if the prediction-result success rate of the adjusted initial prediction model is less than or equal to the prediction success-rate threshold, regenerating the sample pruning model to rebuild the initial prediction model.
Wherein, the obtaining, when the loss value is smaller than the loss-function threshold, of the prediction-result success rate of the adjusted initial prediction model includes:
when the loss value is smaller than the loss-function threshold, testing the adjusted initial prediction model on sample test models to obtain the sample test results corresponding to the sample test models, where a sample test model is obtained by pruning the sample model;
determining the prediction errors corresponding to the adjusted initial prediction model based on the sample test results;
determining the sample test results whose prediction errors are smaller than an error threshold to be successful sample test results;
and computing the prediction-result success rate of the adjusted initial prediction model from the total number of sample test results and the number of successful sample test results.
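The counting step above reduces to a short computation: a sample test result is successful when its prediction error is below the error threshold, and the success rate is successes over total results.

```python
# Sketch of the prediction-success-rate statistic: count sample test
# results whose prediction error is below the error threshold and divide
# by the total number of sample test results.

def prediction_success_rate(prediction_errors, error_threshold):
    successes = sum(1 for e in prediction_errors if e < error_threshold)
    return successes / len(prediction_errors)
```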
Wherein, the determining of the prediction error corresponding to the adjusted initial prediction model based on a sample test result includes:
obtaining the test performance ratio in the sample test result;
determining the test-model running performance corresponding to the sample test model according to the test performance ratio in the sample test result and the running performance of the original model;
obtaining the actual model running performance corresponding to the sample test model;
and determining the prediction error corresponding to the adjusted initial prediction model according to the test-model running performance, the actual running performance of the sample test model, and the running performance of the original model.
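One plausible reading of the error computation above (the patent does not give the exact formula, so the normalisation below is an assumption): convert the predicted test performance ratio into an absolute performance, then take its difference from the measured performance relative to the original model's performance.

```python
# Assumed sketch of the prediction-error step: predicted performance is
# the test performance ratio times the original model's performance; the
# error is the gap to the actual measured performance, normalised by the
# original model's performance.

def prediction_error(test_ratio, actual_performance, original_performance):
    predicted_performance = test_ratio * original_performance
    return abs(predicted_performance - actual_performance) / original_performance
```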
An embodiment of the present application provides a data processing apparatus, which is integrated in a computer device, and includes:
the pruning module is used for carrying out pruning processing on the first model to obtain a pruning model;
the first training module is used for training the pruning model to obtain a second model and obtain the success rate of the training result of the second model;
the first prediction module is used for predicting the model operation performance corresponding to the second model according to the operation performance of an original model and the model topological structure of the second model if the success rate of the training result is greater than the success rate threshold;
the re-pruning module is used for re-pruning the second model if the running performance of the model corresponding to the second model does not reach the performance threshold;
and the first determining module is used for determining that the second model is a target model for performing business processing if the model operation performance corresponding to the second model reaches the performance threshold.
Wherein, the pruning module includes:
a first obtaining unit, configured to obtain the result influence degree corresponding to the channel parameters in the first model;
a first determining unit, configured to determine pruning channels among the channel parameters based on the result influence degree;
and a deleting unit, configured to delete the pruning channels from the first model to obtain a pruning model.
Wherein, the first training module includes:
the training unit is used for training the pruning model according to the first sample data to obtain a second model;
a storage unit, configured to store a training result corresponding to the first sample data output by the second model in a statistical database; the statistical database is used for storing the success rate of the training result of the second model; the success rate of the training result is generated according to the sample label corresponding to the first sample data and the statistics of the training result;
and the reading unit is used for reading the success rate of the training result of the second model from the statistical database.
Wherein the first prediction module comprises:
an input unit, configured to input a model topology structure of the second model and a model parameter of the second model to a prediction model associated with an original model if the success rate of the training result is greater than a success rate threshold; the original model is a model which is associated with the first model and is not subjected to pruning treatment;
a prediction unit configured to predict a performance ratio corresponding to the second model by the prediction model;
a second obtaining unit, configured to perform a performance test on the original model, and obtain an operation performance of the original model corresponding to the original model;
and a second determining unit, configured to determine the model running performance corresponding to the second model based on the running performance of the original model and the performance ratio.
Wherein, the second obtaining unit includes:
the first obtaining subunit is configured to perform a performance test on the original model, and obtain at least two initial operating performances corresponding to the original model;
the average processing subunit is configured to perform average processing on the at least two initial operating performances to obtain an average value corresponding to the at least two initial operating performances;
and the first determining subunit is used for determining the average value as the running performance of the original model.
Wherein, the apparatus further includes:
a first obtaining module, configured to obtain, from a locally stored successful pruning model set, a successful pruning model with a latest storage timestamp if a success rate of the training result is less than or equal to the success rate threshold; the successful pruning model is the second model with the training result success rate larger than the success rate threshold;
and the updating module is used for updating the pruning model based on the successful pruning model with the latest storage time stamp.
Wherein, the apparatus further includes:
and a second determining module, configured to store the second model with the training result success rate greater than the success rate threshold in the successful pruning model set, and determine the second model as a successful pruning model with a latest storage timestamp in the successful pruning model set.
Wherein, the apparatus further includes:
the second acquisition module is used for acquiring model parameters of the sample model; the model parameters of the sample model and the model parameters of the original model have the same model topological structure;
a deleting module, configured to randomly delete a channel parameter in the model parameters of the sample model to obtain a sample pruning model corresponding to the sample model;
the second training module is used for training the sample pruning model according to second sample data to obtain a sample training model;
a third obtaining module, configured to obtain an actual performance ratio between the sample training model and the sample model;
the second prediction module is used for predicting the prediction performance ratio corresponding to the sample training model through the initial prediction model;
an adjusting module, configured to adjust the initial prediction model according to the actual performance ratio and the predicted performance ratio;
and a third determining module, configured to determine the adjusted initial prediction model as the prediction model associated with the original model when the adjusted initial prediction model satisfies a convergence condition.
Wherein the second prediction module is further configured to:
inputting a model topology structure of the sample training model and model parameters of the sample training model to the initial prediction model, and predicting a prediction performance ratio corresponding to the sample training model by the initial prediction model.
Wherein, the apparatus further includes:
a fourth determining module, configured to determine a loss value of the adjusted initial prediction model according to the actual performance ratio, the prediction performance ratio output by the adjusted initial prediction model, and the regularized loss value of the adjusted initial prediction model;
a fourth obtaining module, configured to obtain a success rate of a prediction result of the adjusted initial prediction model when the loss value is smaller than a loss function threshold;
a fifth determining module, configured to determine that the adjusted initial prediction model satisfies a convergence condition if a success rate of the prediction result of the adjusted initial prediction model is greater than a prediction success rate threshold;
and the regeneration module is used for regenerating the sample pruning model to reconstruct the initial prediction model if the success rate of the prediction result of the adjusted initial prediction model is less than or equal to the prediction success rate threshold.
Wherein, the fourth obtaining module includes:
a testing unit, configured to test the adjusted initial prediction model on a sample test model when the loss value is smaller than the loss-function threshold, to obtain the sample test result corresponding to the sample test model, where the sample test model is obtained by pruning the sample model;
a third determining unit, configured to determine a prediction error corresponding to the adjusted initial prediction model based on the sample test result;
a fourth determining unit, configured to determine the sample test result with the prediction error smaller than the error threshold as a successful sample test result;
and the statistical unit is used for counting the success rate of the prediction result of the adjusted initial prediction model according to the total number of the sample test results and the number of the successful sample test results.
Wherein the third determination unit includes:
the second acquisition subunit is used for acquiring the test performance ratio in the sample test result;
a second determining subunit, configured to determine, according to a test performance ratio in the sample test result and the operation performance of the original model, the operation performance of the test model corresponding to the sample test model;
the third obtaining subunit is configured to obtain an actual model operation performance corresponding to the sample test model;
and a third determining subunit, configured to determine, according to the running performance of the test model, the actual running performance of the sample test model, and the running performance of the original model, a prediction error corresponding to the adjusted initial prediction model.
One aspect of the present application provides a computer device, comprising: a processor, a memory, a network interface;
the processor is connected to the memory and the network interface, where the network interface provides data-communication functions, the memory stores a computer program, and the processor calls the computer program to perform the method in the above aspect of the embodiments of the present application.
An aspect of the present application provides a computer-readable storage medium storing a computer program that includes program instructions which, when executed by a processor, perform the method of the above aspect of the embodiments of the present application.
In the embodiments of this application, the first model can be pruned and trained to obtain a second model, together with the training-result success rate of the second model. Furthermore, the model running performance of the second model can be predicted quickly from the model topology of the second model and the running performance of the original model, without manually testing the second model's running performance; the pruning process therefore takes little time, and model-pruning efficiency is improved. In addition, based on the success-rate threshold and the performance threshold in the pruning judgment condition, the first model can be pruned automatically, so that a balance point between the success-rate threshold and the performance threshold can be found quickly, further improving model-pruning efficiency.
Drawings
To explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and a person skilled in the art can derive other drawings from these drawings without creative effort.
Fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present application;
fig. 2 is a schematic view of a scenario for performing service data interaction according to an embodiment of the present application;
fig. 3 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 4 is a system block diagram provided in an embodiment of the present application;
FIG. 5 is an architecture diagram of a statistical training result success rate according to an embodiment of the present disclosure;
fig. 6 is a schematic view of a scenario for determining model operation performance corresponding to a second model according to an embodiment of the present disclosure;
FIG. 7 is a schematic flow chart diagram illustrating a method for determining a predictive model according to an embodiment of the present disclosure;
fig. 8 is a scene schematic diagram for recording an actual performance ratio corresponding to a sample training model according to an embodiment of the present disclosure;
FIG. 9 is a flowchart of automated pruning provided by an embodiment of the present application;
FIG. 10 is a scene diagram of data interaction using an object model according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
fig. 12 is a schematic diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
Please refer to FIG. 1, which is a schematic structural diagram of a network architecture according to an embodiment of the present application. As shown in FIG. 1, the network architecture may include a server 2000 and a user-terminal cluster, and the cluster may include a plurality of user terminals: a user terminal 3000a, a user terminal 3000b, a user terminal 3000c, …, and a user terminal 3000n.
As shown in FIG. 1, the user terminals 3000a, 3000b, 3000c, …, and 3000n may each establish a network connection with the server 2000, so that every user terminal can exchange data with the server 2000 through its network connection.
As shown in FIG. 1, each user terminal in the cluster may have a target application installed. When the target application runs on a user terminal, it may exchange data with the server 2000 shown in FIG. 1. The target application may be an application that performs business processing through a target model, and the target model may perform business processing such as image recognition, artificial-intelligence games, and speech recognition.
It can be understood that Artificial Intelligence (AI) is a new technical science that uses a digital computer, or a machine controlled by a digital computer (e.g., the server 2000 shown in FIG. 1), to simulate, extend, and expand human intelligence. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning, and decision-making.
Artificial-intelligence technology is a comprehensive discipline involving a wide range of fields, covering both hardware-level and software-level technologies. Basic artificial-intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big-data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how computers simulate or implement human learning behavior to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied across all fields of AI. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning. It can be understood that, in the embodiments of this application, the prediction model that determines the performance ratio corresponding to a model can be obtained through machine learning.
For ease of understanding, in the embodiments of this application, one user terminal may be selected from the plurality of user terminals shown in FIG. 1 as the target user terminal. The target user terminal may be an intelligent terminal with a video-data-processing function, such as a smartphone, a tablet computer, or a desktop computer. For example, in the embodiments of this application, the user terminal 3000a shown in FIG. 1 may serve as the target user terminal, with the target application integrated in it; the target user terminal can then exchange data with the server 2000 through the service data platform corresponding to the target application.
For easy understanding, please refer to fig. 2, which is a schematic view of a scenario for performing service data interaction according to an embodiment of the present application. The computer device in the embodiment of the present application may be an entity terminal having a model optimization function, and the entity terminal may be a server or a user terminal. The embodiment of the present application takes the server 2000 shown in fig. 1 as an example to illustrate a process of performing model optimization on a neural network model by the computer device.
The original model processed by the computer device may be an already trained neural network model. It is understood that the neural network model may be a model for performing image recognition, a model for playing an artificial intelligence game, a model for performing speech recognition, or the like. The embodiments of the present application do not limit the business processing that the neural network model can perform. In the embodiments of the present application, a neural network model for performing the business processing of image recognition is taken as an example to illustrate the process of the computer device performing model optimization on the neural network model.
It is understood that the computer device may initially train the neural network model for performing the business processing of image recognition, and when the training result success rate of the neural network model is greater than the success rate threshold in the pruning decision condition shown in fig. 2, the computer device may determine that the initial training of the neural network model is successful. The training result success rate is generated by statistically comparing the training results with the sample labels corresponding to the sample data used to train the neural network model. When the training result success rate is greater than the success rate threshold, it can be understood that the neural network model satisfies the convergence condition, i.e., the training is successful. It should be understood that the neural network model that is successfully trained for the first time may be referred to as the original model in the embodiments of the present application.
As shown in fig. 2, the computer device may determine the original model as the first model that requires pruning. A neural network model that requires pruning may be referred to as a first model in the embodiments of the present application. Further, the computer device may perform pruning on the first model to obtain a pruned first model. In the embodiments of the present application, the first model obtained after pruning may be referred to as a pruning model. It should be understood that the computer device may train the pruning model multiple times according to the first sample data, resulting in a trained pruning model. The trained pruning model may be referred to as a second model in the embodiments of the present application. It can be understood that, when the pruning model is trained for the last time, the computer device may store the training results corresponding to the first sample data output by the second model in a statistical database, where the statistical database is used to store the training result success rate of the second model. The training result success rate is generated by statistically comparing the training results with the sample labels corresponding to the first sample data.
it should be appreciated that the computer device may make the decision based on a success rate threshold and a performance threshold in the pruning decision condition when evaluating the second model, as shown in fig. 2. Wherein the success rate threshold and the performance threshold can be set by a user. For example, the user may set the success threshold to be 95% and the performance threshold to be 850.
It will be appreciated that, when evaluating the second model, the computer device may read the training result success rate of the second model from the statistical database. Suppose the training result success rate read by the computer device from the statistical database is 93%; in other words, the training result success rate does not reach the success rate threshold (i.e., 95%). At this time, the computer device may obtain the successful pruning model with the latest stored timestamp from the locally stored set of successful pruning models shown in fig. 2. A successful pruning model is a second model whose training result success rate is greater than the success rate threshold. The computer device may then update the pruning model based on the successful pruning model with the latest stored timestamp; in other words, the computer device may train that successful pruning model again.
If the training result success rate read by the computer device from the statistical database is 98%, the training result success rate reaches the success rate threshold. At this time, the computer device may store the second model in the local set of successful pruning models and determine the second model as the successful pruning model with the latest stored timestamp in the set.
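The bookkeeping described above can be sketched as follows. This is an illustrative, minimal sketch (class and function names are hypothetical, not from the patent); it keeps a model in the successful-pruning-model set only when its training result success rate exceeds the threshold, and retrieves the entry with the latest stored timestamp.

```python
import itertools

class SuccessfulPruningModelSet:
    """Locally stored set of successful pruning models, keyed by storage
    timestamp. Illustrative sketch; a monotonically increasing counter
    stands in for a real stored timestamp."""

    def __init__(self):
        self._counter = itertools.count()  # monotonically increasing "timestamp"
        self._entries = []                 # list of (timestamp, model) pairs

    def store(self, model):
        # Record the model together with its storage timestamp.
        self._entries.append((next(self._counter), model))

    def latest(self):
        # Return the successful pruning model with the latest stored timestamp.
        if not self._entries:
            return None
        return max(self._entries, key=lambda entry: entry[0])[1]

SUCCESS_RATE_THRESHOLD = 0.95  # the 95% threshold from the example

def evaluate_and_store(model, success_rate, model_set):
    # Keep the model only when its training result success rate
    # exceeds the success rate threshold.
    if success_rate > SUCCESS_RATE_THRESHOLD:
        model_set.store(model)
        return True
    return False
```

For example, a model with a 98% success rate is stored and becomes the latest successful pruning model, while a model with a 93% success rate is rejected and the previously stored model is returned instead.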
Further, when the training result success rate of the second model is greater than the success rate threshold, the computer device may perform performance prediction on the second model based on the prediction model shown in fig. 2, so as to obtain the model operation performance corresponding to the second model. The model operation performance corresponding to the second model may be the queries per second (Queries Per Second, abbreviated as QPS) of the second model, that is, the number of business queries the second model performs within one second.
It is understood that the computer device may determine that the model operation performance corresponding to the second model is 800. Since the second model may be a model for image recognition, this means the second model can recognize 800 images within 1 second, that is, recognizing one image takes 1.25 ms (1000 ms / 800). In this case, the model operation performance corresponding to the second model does not reach the performance threshold (e.g., 850). At this time, the computer device may prune the second model again; that is, the computer device may determine the second model as the first model to be pruned, so that the first model may be pruned. It can be understood that the first model in the embodiments of the present application may be the original model, or may be a second model whose model operation performance does not reach the performance threshold.
If the computer device determines that the model operating performance corresponding to the second model is 880, that is, the model operating performance corresponding to the second model reaches the performance threshold (e.g., 850). At this time, the computer device may determine the second model as a target model for performing business processes.
It should be appreciated that the target model may be used in a target application for any of the user terminals (e.g., user terminal 3000a) in the user terminal cluster shown in FIG. 1 above. Wherein, the target application can adopt the target model to perform the business processing of image recognition. As shown in fig. 2, the image 10 is input into the target model corresponding to the target application, and the target application can determine that the output result of the image 10 is an animal through the target model. The image 20 is input into the target model corresponding to the target application, and the target application can determine that the output result of the image 20 is a person through the target model.
The specific implementation manner of the computer device performing model optimization on the first model based on the success rate threshold and the performance threshold in the pruning judgment condition may refer to the following embodiments corresponding to fig. 3 to 9.
Further, please refer to fig. 3, which is a flowchart illustrating a data processing method according to an embodiment of the present application. The method can be applied to computer equipment with a model optimization function. As shown in fig. 3, the method may include:
s101, pruning the first model to obtain a pruning model.
Specifically, the computer device may obtain the result influence degree corresponding to each channel parameter in the model parameters of the first model. The first model may be a neural network model to be pruned. It should be understood that the computer device may sort the channel parameters by their result influence degrees (e.g., in descending or ascending order), so as to determine the pruning channels. Further, the computer device may delete the pruning channels from the first model, thereby obtaining a pruning model.
One channel parameter of a model parameter may correspond to one matrix dimension of the weight matrix of that model parameter. The result influence degree of a channel parameter may be determined from the elements in the dimension of the weight matrix to which the channel parameter corresponds. It should be understood that the result influence degree may be the gradient values of all matrix elements in the corresponding dimension of the weight matrix, or the average value of all matrix elements in that dimension, which is not limited herein. The result influence degree refers to the degree of influence that the corresponding channel parameter has on the output result of the first model. It is understood that the greater the result influence degree, the greater the influence of the corresponding channel parameter on the output result of the first model.
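The channel-selection step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it uses the mean absolute value of the matrix elements in each channel's dimension as an illustrative stand-in for the result influence degree (the text allows either gradient values or element averages), and all function names are hypothetical.

```python
def channel_influence(weight_matrix):
    """Result influence degree per channel: here, the mean absolute value of
    the matrix elements in each channel's dimension (illustrative choice)."""
    return [sum(abs(x) for x in row) / len(row) for row in weight_matrix]

def select_pruning_channels(weight_matrix, k):
    """Sort channels by result influence degree in ascending order and
    pick the K channels with the smallest influence as pruning channels."""
    influence = channel_influence(weight_matrix)
    order = sorted(range(len(influence)), key=lambda i: influence[i])
    return order[:k]

def prune(weight_matrix, k):
    """Delete the K pruning channels, reducing the weight matrix dimension."""
    to_remove = set(select_pruning_channels(weight_matrix, k))
    return [row for i, row in enumerate(weight_matrix) if i not in to_remove]
```

For instance, given a three-channel weight matrix whose first channel has near-zero elements, pruning one channel removes that first channel and leaves a two-channel matrix.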
The computer device in the embodiment of the present application may be an entity terminal having a function of model optimization (e.g., model pruning), and the entity terminal may be a server or a user terminal. The embodiment of the present application takes the server 2000 shown in fig. 1 as an example to illustrate a process of performing model optimization on a neural network model by the computer device. It is understood that the neural network model may be a model for image recognition, a model for artificial intelligence game, a model for voice recognition, etc. The embodiment of the present application does not limit the service processing that can be performed by the neural network model.
For easy understanding, please refer to fig. 4, which is a system structure diagram provided in the embodiment of the present application. As shown in fig. 4, the system architecture diagram may include a training layer, an evaluation layer, and a pruning layer.
It should be understood that data transmission can be performed among the training layer, the evaluation layer, and the pruning layer in the computer device, so that automatic pruning is realized, and finally, a target model for performing business processing is output. It is understood that the computer device may import the neural network model into the training layer for training, so that the initially trained neural network model, i.e., the original model (e.g., model a), may be obtained. The embodiment of the application may determine the model a as a first model, and input the model a into the pruning layer for pruning, so as to obtain a pruning model (e.g., model B). It is understood that the neural network model may be a model for performing image recognition, a model for performing an artificial intelligence game, a model for performing voice recognition, or the like. The embodiment of the present application does not limit the service processing that can be performed by the neural network model.
It can be understood that the model a may include a plurality of model parameters, each of which corresponds to a weight matrix, and the number of channel parameters in each model parameter corresponds to the dimension of its weight matrix. For example, the model a may include 3 model parameters, i.e., a model parameter a, a model parameter b, and a model parameter c. The model parameter a may include a 256-dimensional weight matrix a, the model parameter b may include a 256-dimensional weight matrix b, and the model parameter c may include a 128-dimensional weight matrix c. In other words, the model parameter a may contain 256 channel parameters, the model parameter b may contain 256 channel parameters, and the model parameter c may contain 128 channel parameters.
It should be understood that, in the embodiment of the present application, the gradient values of all matrix elements of a certain dimension in a weight matrix may be used as the result influence of the channel parameter corresponding to the dimension. For example, the resulting influence of the first channel parameter in the model parameter a may be the gradient values of all matrix elements in the first dimension of the weight matrix a. The resulting influence of the second channel parameter in the model parameter a may be the gradient values of all matrix elements in the second dimension of the weight matrix a. By analogy, the description will not be continued.
Further, when the computer device performs pruning processing, the pruning channels can be determined among the channel parameters according to the result influence degrees. It should be understood that the computer device may sort the result influence degrees corresponding to all the channel parameters in the three model parameters, and then delete the K channel parameters with the smallest result influence degrees (i.e., reduce the dimensions of the corresponding weight matrices), where K may be a positive integer. To prevent the computer device from over-pruning the model in later pruning rounds, the number of channel parameters deleted in each round of pruning processing may differ. As the number of pruning rounds increases, the number of channel parameters pruned by the computer device may decrease. In other words, the more pruning rounds have been performed, the fewer channel parameters are deleted in each round. For example, in the first round of pruning, the computer device may obtain the 8 channel parameters with the smallest result influence degrees and delete those 8 channel parameters from the first model. In the second round, the computer device may obtain and delete the 6 channel parameters with the smallest result influence degrees.
Optionally, the computer device may sort, in descending order, the result influence degrees of the channel parameters within each model parameter, and then delete the K channel parameters with the smallest result influence degrees in each model parameter (i.e., reduce the dimension of the corresponding weight matrix), where K may be a positive integer. For example, the model parameter a in the model a may include a 256-dimensional weight matrix a, the model parameter b may include a 256-dimensional weight matrix b, and the model parameter c may include a 128-dimensional weight matrix c. In other words, the model parameter a may contain 256 channel parameters, the model parameter b may contain 256 channel parameters, and the model parameter c may contain 128 channel parameters. When the computer device prunes the model a, the 2 channel parameters with the smallest result influence degrees may be deleted from each of the model parameters a, b, and c, so that a pruning model (e.g., model B) may be obtained. The model parameter a in the model B may then include 254 channel parameters with a corresponding 254-dimensional weight matrix a, the model parameter b may include 254 channel parameters with a corresponding 254-dimensional weight matrix b, and the model parameter c may include 126 channel parameters with a corresponding 126-dimensional weight matrix c. In addition, when pruning the model, the number of channel parameters deleted by the computer device may also differ between model parameters.
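The decreasing per-round prune count described above can be sketched with a simple schedule. The linear schedule and its parameters are assumptions for illustration (the text only gives the example 8 channels in round one, 6 in round two); the function name is hypothetical.

```python
def prune_count_for_round(round_index, initial_count=8, step=2, minimum=1):
    """Number of channel parameters to delete in a given pruning round.

    The count decreases as the number of pruning rounds grows, matching the
    text's example (8 in the first round, 6 in the second). The linear
    decay and the floor of 1 are illustrative assumptions.
    """
    return max(minimum, initial_count - step * (round_index - 1))
```

Under this sketch, later rounds never delete fewer than one channel, which prevents the schedule from reaching zero while still shrinking the per-round pruning amount.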
S102, training the pruning model to obtain a second model, and obtaining the success rate of the training result of the second model.
Specifically, the computer device may train the pruning model multiple times according to the first sample data to obtain a trained pruning model. The trained pruning model may be referred to as a second model in the embodiments of the present application. At this time, the computer device may store the training results corresponding to the first sample data output by the second model in a statistical database. The statistical database is used to store the training result success rate of the second model; the training result success rate is generated by statistically comparing the training results with the sample labels corresponding to the first sample data. Further, the computer device may read the training result success rate of the second model from the statistical database.
It should be appreciated that the computer device may train the pruning model a plurality of times based on the first sample data, such that the second model may be derived. It is understood that the computer device may adjust the matrix elements of the weight matrix corresponding to each model parameter in the pruning model. As shown in fig. 4, the computer device may input the model B into the training layer for training to obtain a second model (e.g., model C), and further, the computer device may obtain a success rate of the training result of the model C.
It will be appreciated that the model B is the result of pruning the model a by the computer device. At this time, the computer device may train the model B a plurality of times based on the sample data a (first sample data), so that the trained model B may be obtained. In this embodiment, the pruning model after training may be referred to as a second model (e.g., model C).
It should be understood that the computer device may store the training result corresponding to the sample data a output by the model C to the statistical database. Further, the computer device may read the training result success rate of the model C from the statistical database.
For ease of understanding, please refer to fig. 5, which is an architecture diagram for computing training result success rate statistics provided by an embodiment of the present application. As shown in fig. 5, the model 300 may be the second model whose training result success rate the computer device is computing at the current time. The original model associated with the model 300 may be the model 100.
It is understood that the model 100 may be derived from a neural network model imported into the computer device after a plurality of training sessions. The neural network model may be a model for performing image recognition. Further, the computer device may train the model 100, such that a success rate of the training results of the model 100 may be determined. Further, the computer device may store the training result success rate (i.e., training result success rate 1) of the model 100 in a statistical database as shown in fig. 5. It is to be appreciated that when the success rate of the training results of the model 100 is greater than the success rate threshold of the pruning decision condition, the computer device may store the model 100 in a local set of successful pruning models and determine the model 100 as the successful pruning model with the latest stored timestamp in the set of successful pruning models. Then, the computer device may determine the model operation performance of the model 100, and may perform pruning processing on the model 100 and perform multiple training on the model 100 after the pruning processing when the model operation performance of the model 100 does not reach the performance threshold of the pruning determination condition, so as to obtain the model 200.
Further, the computer device may evaluate the model 200, so that a success rate of the training result of the model 200 may be obtained. Further, the computer device may store the training result success rate (i.e., training result success rate 2) of the model 200 in a statistical database as shown in fig. 5. It is to be understood that, when the success rate of the training result of the model 200 is greater than the success rate threshold in the pruning decision condition, the computer device may further store the model 200 into a local successful pruning model set, and determine the model 200 as the successful pruning model with the latest stored timestamp in the successful pruning model set. Further, the computer device may determine the model operation performance of the model 200, and when the model operation performance of the model 200 does not reach the performance threshold in the pruning determination condition, the computer device may perform pruning processing on the model 200 and perform multiple training on the pruned model 200, so as to obtain the model 300 as shown in fig. 4.
The computer device in this embodiment of the present application may be the computer device corresponding to fig. 4. It should be understood that the embodiment of the present application takes the model 300 as an example to illustrate the process of the computer device counting the success rate of the training result of the second model. It is understood that the computer device may input the sample data a (i.e. the first sample data) shown in fig. 4 into the model 300, and obtain a training result corresponding to the sample data a. For example, the sample data a may include 100 image data, and specifically may include: image data 1, image data 2, …, and image data 100. The training result corresponding to the image data 1 is a training result 1, the training result corresponding to the image data 2 is a training result 2, and so on.
It should be understood that the training result corresponding to the sample data a output by the model 300 may be output in the form of log file (a data output form). At this time, the distributed storage system (Kafka Cluster, a system for storing data) and the data statistics tool (Logstash, a tool for performing data statistics) shown in fig. 4 may perform statistical analysis on the training result output by the model 300, so as to generate the training result success rate 3 of the model 300.
It is understood that the computer device may match the training result corresponding to the sample data a in the model 300 with the sample label corresponding to the sample data a. For example, if the sample label corresponding to the image data 1 is a person, and the training result obtained by the model 300 for the image data 1 is also a person, the computer device may determine that the training result for the image data 1 is successfully matched with the sample label for the image data 1, and further may add one to the number of times of successful matching to obtain the number of successful matches. In addition, if the sample label corresponding to the image data 2 is a person and the training result obtained by the model 300 for the image data 2 is a plant, the computer device may determine that the training result of the image data 2 fails to match the sample label, and further may add one to the number of matching failures to obtain the number of matching failures.
It should be appreciated that the computer device may determine the training result success rate (i.e., training result success rate 3) of the model 300 based on the ratio between the number of matching successes and the total number of sample data (i.e., the sum of the number of matching successes and the number of matching failures). For example, the computer device may input 100 image data in the sample data a into the model 300, so as to obtain a training result corresponding to each image data of the 100 image data. Further, the computer device performs statistical analysis on the label information of each image data and the corresponding training result based on the distributed storage system and the data statistical tool, so that it can be determined that the number of successful matching is 98, and then the computer device can determine that the success rate 3 of the training result of the model 300 is 98%.
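The matching and ratio computation described above can be sketched directly. This is a minimal illustration (the function name is hypothetical): each training result is matched against its sample label, and the success rate is the ratio of successful matches to the total number of samples.

```python
def training_result_success_rate(training_results, sample_labels):
    """Match each training result with its sample label and return the
    ratio of successful matches to the total number of samples."""
    assert len(training_results) == len(sample_labels)
    matches = sum(
        1 for result, label in zip(training_results, sample_labels)
        if result == label  # e.g. both "person" counts as a successful match
    )
    return matches / len(sample_labels)
```

With 100 image samples of which 98 training results match their labels (as in the example), the function returns a success rate of 0.98, i.e., 98%.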
At this time, the computer device may store the training result success rate 3 of the model 300 into the statistical database, where the training result success rate 3 has the latest stored timestamp. Further, when reading the training result success rate of the model 300, the computer device may send a service request for querying the training result success rate of the model 300 to the statistical database through the search client shown in fig. 5. The statistical database may then obtain the training result success rate 3 with the latest timestamp based on the service request and return it to the search client. The training result success rate of the model 300 obtained by the computer device is thus training result success rate 3.
And S103, if the success rate of the training result is greater than the success rate threshold, predicting the model operation performance corresponding to the second model according to the operation performance of the original model and the model topology structure of the second model.
Specifically, if the training result success rate is greater than the success rate threshold, the computer device may input the model topology of the second model and the model parameters of the second model into the prediction model associated with the original model. The original model may be a model that is associated with the first model and has not been pruned. At this time, the computer device may predict the performance ratio corresponding to the second model through the prediction model. Further, the computer device may perform a performance test on the original model, so as to obtain the original model operation performance corresponding to the original model. The computer device may then determine the model operation performance corresponding to the second model based on the original model operation performance and the performance ratio.
It should be appreciated that the computer device may input the model C into the evaluation layer to evaluate the success rate of the training results of the model C, as shown in fig. 4. If the success rate of the training result of the model C (second model) is greater than the success rate threshold (e.g., 95%) in the pruning determination condition, the computer device may store the model C in the local successful pruning model set. Further, the computer device may determine the model C as the successful pruning model in the set of successful pruning models having the latest stored timestamp. For example, if the success rate of the training result of the model C is 98%, the computer device may store the model C into a local successful pruning model set, and determine the model C as the successful pruning model with the latest storage timestamp in the successful pruning model set.
In addition, the computer device may evaluate the model operation performance of the model C in the evaluation layer. The model operation performance may be the queries per second (i.e., QPS), which characterizes the number of business queries the model performs in one second. For ease of understanding, please refer to fig. 6, which is a schematic view of a scenario for determining the model operation performance corresponding to the second model according to an embodiment of the present application. The original model in this embodiment may be the model a in the embodiment corresponding to fig. 4, and the second model may be the model C in the embodiment corresponding to fig. 4.
It will be appreciated that the computer device may obtain the original model operation performance corresponding to the original model. To reduce measurement errors caused by factors such as temperature and resource occupancy, the computer device may perform multiple performance tests on the original model, so that the original model operation performance can be obtained more accurately. It should be appreciated that the computer device may perform performance tests on the original model to obtain at least two initial operating performances corresponding to the original model. Further, the computer device may average the at least two initial operating performances to obtain their average value, and may use that average value as the original model operation performance corresponding to the original model.
For example, the computer device may perform performance tests on the model a, so that at least two (e.g., 5) initial operating performances may be obtained. For instance, the model a may have an initial operating performance 1 of 765, an initial operating performance 2 of 762, an initial operating performance 3 of 775, an initial operating performance 4 of 783, and an initial operating performance 5 of 765. Further, the computer device may average these 5 initial operating performances to obtain their average value (i.e., 770). At this time, the computer device may take the average value as the model operation performance (i.e., the original model operation performance) corresponding to the model a. It will be appreciated that averaging the at least two initial operating performances may yield a non-integer average. In that case, the computer device may approximate (e.g., round) the non-integer average to obtain an approximate value, and may take that value as the original model operation performance corresponding to the original model.
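The averaging-and-rounding step above can be sketched in a few lines. This is a minimal illustration (the function name is hypothetical): it averages repeated QPS measurements and rounds a non-integer average to the nearest integer.

```python
def original_model_operating_performance(initial_performances):
    """Average repeated performance-test results (QPS) and round a
    non-integer average to the nearest integer, as the text describes."""
    average = sum(initial_performances) / len(initial_performances)
    return round(average)
```

With the example measurements 765, 762, 775, 783, and 765, the average is exactly 770, matching the original model operation performance used in the text.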
Further, the computer device may obtain a model topology of a second model (e.g., model C) and model parameters of the model C. Further, the computer device can input the model topology of the model C and the model parameters of the model C to a prediction model associated with an original model (e.g., model a), and can predict a performance ratio corresponding to the model C through the prediction model.
The prediction model can be used for predicting the performance ratio between the model operation performance of the second model and the original model operation performance corresponding to the original model. The model topology of the sample model in the prediction model is the same as the model topology of the original model. It is to be understood that the original model may be a model associated with the first model and not subjected to pruning, and the original model in the embodiment of the present application may be a model a (i.e., a first model) as shown in fig. 4.
For example, the model topology of the model C may include a first sub-topology and a second sub-topology. The first sub-topology may be a convolutional neural network, namely cnn [256, 256, 128, 128]; the second sub-topology may be a deep neural network, i.e., dnn [256, 256, 128, 128]. At this time, the computer device may determine 8 model parameters according to the model topology of the model C, and may further input these 8 model parameters into the prediction model associated with the original model (e.g., model a), so that the performance ratio corresponding to the model C may be predicted by the prediction model. It is to be understood that the performance ratio corresponding to the model C refers to the ratio between the model operation performance corresponding to the model C and the original model operation performance corresponding to the original model (e.g., model a).
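The step of deriving the 8 model parameters from the two sub-topologies can be sketched as a simple flattening. This representation is an assumption for illustration (the patent does not specify the exact encoding fed to the prediction model), and the function name is hypothetical.

```python
def prediction_model_input(model_topology):
    """Flatten the sub-topologies of the second model (e.g. cnn and dnn
    layer widths) into the list of model parameters fed to the prediction
    model. The flat-vector encoding is an illustrative assumption."""
    features = []
    for sub_topology in model_topology.values():
        features.extend(sub_topology)
    return features

# Topology of model C from the example: cnn [256, 256, 128, 128]
# and dnn [256, 256, 128, 128], yielding 8 model parameters.
model_c_topology = {
    "cnn": [256, 256, 128, 128],
    "dnn": [256, 256, 128, 128],
}
```

Flattening this topology produces the 8-element parameter vector mentioned in the text.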
Further, the computer device may determine a model operation performance corresponding to the model C based on the performance ratio corresponding to the model C and the original model operation performance corresponding to the original model (model a). Specifically, the computer device determines the model operation performance corresponding to the second model as shown in the following formula (1):
per_i = perf_ratio * per_ori,    (1)

where per_i is the model operation performance corresponding to the second model, perf_ratio is the performance ratio corresponding to the second model, and per_ori is the original model operation performance corresponding to the original model.
For example, if the computer device predicts through the prediction model that the performance ratio corresponding to the model C is 1.2, and the original model operation performance corresponding to the original model is 770, the computer device determines, according to the above formula (1), that the model operation performance corresponding to the model C is 924. It is understood that the number of business queries that model C can process in one second is 924. In other words, if the model C is a model for image recognition, the model C can recognize 924 images in one second and obtain recognition results corresponding to the 924 images.
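The arithmetic of formula (1) can be sketched as follows (a hedged example; the function name and the rounding are assumptions of this sketch):

```python
def model_operating_performance(perf_ratio, per_ori):
    # Formula (1): per_i = perf_ratio * per_ori, rounded to whole queries.
    return round(perf_ratio * per_ori)

qps = model_operating_performance(1.2, 770)
print(qps)  # 924 business queries per second, as in the example
```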
Alternatively, as shown in fig. 4, if the success rate of the training result of the model C (the second model) is less than or equal to the success rate threshold (e.g., 95%) in the pruning determination condition, the computer device may obtain a successful pruning model with the latest timestamp from a set of successful pruning models stored locally. Further, the computer device may update the pruning model based on the successful pruning model with the latest stored timestamp, i.e. may train the successful pruning model with the latest stored timestamp again and determine the successful pruning model with the latest timestamp after the training is completed as the second model.
For example, in the embodiment corresponding to fig. 5, if the computer device determines that the success rate of the training result of the model 300 is 93%, the computer device may obtain a successful pruning model (e.g., the model 200) with the latest stored timestamp from the locally stored successful pruning models. Further, the computer device may update the model 200 to the pruning model, retrain the model 200 to obtain a retrained model 200, and determine the retrained model 200 as the second model.
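A minimal sketch of keeping the locally stored successful pruning model set and retrieving the entry with the latest stored timestamp (Python and the counter-based timestamps are assumptions of this illustration):

```python
from itertools import count

_clock = count()                 # stand-in for stored timestamps
successful_pruning_models = []   # locally stored successful pruning model set

def save_successful_model(model):
    # Store the model together with its (monotonically increasing) timestamp.
    successful_pruning_models.append((next(_clock), model))

def latest_successful_model():
    # Return the successful pruning model with the latest stored timestamp.
    _, model = max(successful_pruning_models, key=lambda entry: entry[0])
    return model

save_successful_model("model 100")
save_successful_model("model 200")
print(latest_successful_model())  # model 200
```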
And S104, if the running performance of the model corresponding to the second model does not reach the performance threshold, carrying out pruning processing on the second model again.
Specifically, if the model operation performance corresponding to the second model does not reach the performance threshold in the pruning determination condition, the computer device may perform pruning processing on the second model. In other words, the second model whose model operation performance does not reach the performance threshold may be updated to the first model, and the updated first model may be pruned to obtain the pruned model after pruning.
It should be understood that, as shown in fig. 4, if the model operation performance corresponding to the model C obtained by the computer device through the above-mentioned prediction model is 840, the computer device may determine that the model operation performance corresponding to the model C does not reach the performance threshold (e.g., 850) in the pruning determination condition. At this time, the computer device may determine the model C as the first model (e.g., model a 1). Further, the computer device may perform the re-pruning process on the model a1 to obtain a pruning model after the pruning process (e.g., model B1). Then, the computer device may perform multiple rounds of training on the model B1 and determine the trained model B1 as the second model (e.g., model C1).
And S105, if the model operation performance corresponding to the second model reaches the performance threshold, determining that the second model is a target model for performing business processing.
Specifically, if the model operation performance corresponding to the second model obtained by the computer device through the prediction model reaches the performance threshold in the pruning determination condition, the computer device may determine the second model as a target model for performing business processing. The target model may be applied to any one of the user terminals in the user terminal cluster corresponding to fig. 1, for example, the user terminal 3000 a.
It should be understood that, as shown in fig. 4, if the model operating performance corresponding to the model C predicted by the computer device through the prediction model is 900, the computer device may determine that the model operating performance corresponding to the model C reaches the performance threshold (e.g., 850) in the pruning determination condition. At this point, the computer device may determine the model C as the target model for conducting business processing and export the target model from the computer device. It will be appreciated that the computer device may transmit the model C to any one of the user terminals in the cluster of user terminals as shown in fig. 1, depending on the network connection.
In the embodiment of the application, the computer device can perform pruning processing and training on the first model, so that the second model and the training result success rate of the second model can be obtained. Furthermore, the computer device does not need to manually test the model operation performance of the second model: it can predict the performance ratio corresponding to the second model through the prediction model and quickly obtain the model operation performance of the second model from the performance ratio and the original model operation performance, so that the pruning process consumes little time. The computer device can then automatically prune the first model according to the success rate threshold and the performance threshold in the pruning determination condition, so as to quickly find a balance point between the two thresholds, thereby improving the pruning efficiency of the model.
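The overall decision loop described above can be sketched as follows; this is a hedged outline in which `prune`, `train`, and `predict_ratio` are hypothetical callables standing in for the pruning, training, and prediction steps of the embodiment:

```python
def auto_prune(first_model, prune, train, predict_ratio, per_ori,
               success_threshold=0.95, performance_threshold=850):
    # train(model) returns (trained_model, training_result_success_rate).
    saved = [first_model]                  # successful pruning model set
    while True:
        second_model, success_rate = train(prune(first_model))
        if success_rate <= success_threshold:
            # Roll back: retrain the model with the latest stored timestamp.
            second_model, success_rate = train(saved[-1])
        saved.append(second_model)         # update the successful model set
        performance = predict_ratio(second_model) * per_ori   # formula (1)
        if performance >= performance_threshold:
            return second_model            # target model for business use
        first_model = second_model         # otherwise prune again
```

For instance, with per_ori = 770 and the thresholds above, a predicted ratio of 1.0 (performance 770) triggers another pruning round, while 1.2 (performance 924) terminates the loop.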
Further, please refer to fig. 7, which is a flowchart illustrating a method for determining a prediction model according to an embodiment of the present application. As shown in fig. 7, the method may include:
s201, obtaining model parameters of the sample model.
Specifically, the computer device may obtain the model parameters of the sample model. It will be appreciated that the sample model has the same model topology as the original model described above. The original model may be the original model in the embodiment corresponding to fig. 3, i.e., the model associated with the first model that has not been subjected to pruning.
The computer device in the embodiment of the present application may be an entity terminal having a function of model optimization (e.g., model pruning), and the entity terminal may be a server or a user terminal. The embodiment of the present application takes the server 2000 shown in fig. 1 as an example to illustrate a process of performing model optimization on a neural network model by the computer device. It is understood that the neural network model may be a model for performing image recognition, a model for performing an artificial intelligence game, a model for performing voice recognition, or the like. The embodiment of the present application does not limit the service processing that can be performed by the neural network model.
For easy understanding, please refer to fig. 8, which is a scene diagram illustrating a recording of an actual performance ratio corresponding to a sample training model according to an embodiment of the present application. The sample model 10 in the embodiment of the present application may be associated with the original model (e.g., model a) in the embodiment corresponding to fig. 4.
As shown in fig. 8, in order to more accurately predict the model operation performance corresponding to the second model associated with the original model, the computer device may obtain the model topology of the original model (i.e., model a) in the embodiment corresponding to fig. 4. The model topology structure may include model parameters a, b, and c. Further, the computer device may create a set of neural network models having the same model topology based on the model topology of model a. In the embodiment of the present application, a neural network model newly built by a computer device according to a model topology structure of an original model may be referred to as a sample model. At this time, the computer device may acquire model parameters of the sample model 10 to perform pruning processing on the sample model.
S202, randomly deleting the channel parameters in the model parameters of the sample model to obtain a sample pruning model corresponding to the sample model.
Specifically, the computer device may randomly delete the channel parameter in the model topology structure of the sample model according to the model topology structure of the sample model, and may further obtain a sample pruning model corresponding to the sample model.
It should be understood that, as shown in fig. 8, in order to obtain a more general sample pruning model 20, the computer device may randomly delete the channel parameters (i.e., the dimensions of the weight matrices corresponding to the sample model) among the model parameters in the model topology of the sample model 10. This random deletion of the model topology of the sample model 10 is intended to simulate the pruning process of the original model. It will be appreciated that, when simulating the pruning process, the model parameters of the sample model 10 may be randomly initialized to obtain a large number of sample pruning models 20, so that a prediction model that can more accurately predict the model operation performance may be obtained.
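Randomly deleting channel parameters can be simulated as below (a hedged sketch; representing the topology as a list of layer widths such as cnn[256, 256, 128, 128] is an assumption of this illustration):

```python
import random

def random_channel_prune(layer_widths, keep_min=1):
    # For each layer, keep a random number of channels between keep_min and
    # the original width, simulating random deletion of channel parameters
    # (i.e., dimensions of the weight matrices).
    return [random.randint(keep_min, width) for width in layer_widths]

sample_topology = [256, 256, 128, 128]
pruned_topology = random_channel_prune(sample_topology)
print(pruned_topology)  # each width is at most the original width
```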
And S203, training the sample pruning model according to the second sample data to obtain a sample training model.
Specifically, the computer device may perform multiple times of training on the sample pruning model according to the second sample data, and may determine the trained sample pruning model as the sample training model.
It should be understood that, as shown in fig. 8, the computer device may refer to the sample data used to train the sample pruning model 20 as the second sample data. The original model may be a neural network model for performing image recognition processing. Therefore, the second sample data may be a plurality of image data (for example, 100 pieces of image data) for training the sample pruning model 20, so that the sample training model 30 can be obtained after the training is completed. For a specific implementation manner of training the sample pruning model 20, reference may be made to the description of training the pruning model to obtain the second model in step S102 in the embodiment corresponding to fig. 3.
S204, acquiring the actual performance ratio between the sample training model and the sample model.
Specifically, the computer device may perform multiple performance tests on the sample training model to obtain at least two initial model operating performances corresponding to the sample training model. Further, the computer device may perform an average processing on the at least two initial model operation performances corresponding to the sample training model to obtain an average value of the at least two initial model operation performances corresponding to the sample training model, and use the average value as the model operation performance corresponding to the sample training model. The computer device may then determine an actual performance ratio between the sample training model and the sample model based on the model operating performance of the sample model (i.e., the original model operating performance to which the original model corresponds) and the model operating performance to which the sample training model corresponds.
It should be appreciated that the computer device may perform multiple performance tests on the sample training model 30, as shown in fig. 8, such that at least two (e.g., 5) initial model operating performances of the sample training model 30 may be obtained. Further, the computer device may perform an averaging process on the 5 initial model operating performances corresponding to the sample training model 30 to obtain an average value (e.g., 780), and use the average value as the model operation performance corresponding to the sample training model 30. The computer device may then determine that the actual performance ratio between the sample training model 30 and the sample model 10 is 1.013 (i.e., 780/770) based on the model operation performance of the sample model 10 (i.e., the original model operation performance, e.g., 770) and the model operation performance corresponding to the sample training model. Further, the computer device may record the mapping relationship between the actual performance ratio and the model parameters of the sample training model in a record table as shown in fig. 8.
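The recording step can be sketched as follows (hedged; the dictionary-based record table and rounding to three decimals are assumptions of this example):

```python
def record_actual_ratio(record_table, model_params, measurements, per_ori):
    # Average the measured performances of the sample training model and
    # store the mapping from its model parameters to the actual ratio.
    average = sum(measurements) / len(measurements)
    ratio = round(average / per_ori, 3)
    record_table[tuple(model_params)] = ratio
    return ratio

table = {}
ratio = record_actual_ratio(table, [256, 256, 128, 128],
                            [780, 780, 780, 780, 780], per_ori=770)
print(ratio)  # 1.013, matching the example above
```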
It is understood that the computer device may modify the model parameters in the model topology of the sample model 10 a plurality of times to obtain a corresponding plurality of sample pruning models 20, and train the plurality of sample pruning models 20 to obtain a corresponding plurality of sample training models 30. Further, the computer device may record mapping relationships between model parameters in the model topology of the plurality of sample training models 30 and actual performance ratios of the plurality of sample training models 30 in a record table, so as to serve as training sample data of the initial prediction model, thereby obtaining a prediction model for predicting a performance ratio corresponding to the second model.
And S205, predicting the prediction performance ratio corresponding to the sample training model through the initial prediction model.
Specifically, the computer device may input the model topology of the sample training model and the model parameters of the sample training model into an initial prediction model, and predict a prediction performance ratio corresponding to the sample training model by the initial prediction model.
It should be appreciated that the computer device may determine the predicted performance ratio of the sample training model input to the initial prediction model as shown in the following formula (2):
x_i = f(w_i · x_{i-1}), i = 1, …, n;  y = x_n,    (2)

where y is the prediction performance ratio output by the initial prediction model, x_{i-1} represents the input of the ith fully connected layer during forward propagation (x_0 represents the input values of the initial prediction model), w_i represents the model parameters of the ith layer in the initial prediction model, and f(·) is the activation of each fully connected layer.
It should be understood that in the embodiment of the present application, an initial prediction model may be constructed according to multiple fully-connected layers to obtain a prediction model. The prediction model is actually a design of a nonlinear regression model between model parameters in a model topology of a sample training model input to the initial prediction model and a performance ratio corresponding to the sample training model.
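A toy forward pass through stacked fully connected layers, in the spirit of formula (2), might look like this (the ReLU activation and the specific weights are assumptions of this sketch — the patent does not name the activation):

```python
def forward(x, weights):
    # x_i = f(w_i . x_{i-1}) on hidden layers (f = ReLU here), followed by
    # a linear single-output layer that emits the performance ratio y.
    for w in weights[:-1]:
        x = [max(0.0, sum(wij * xj for wij, xj in zip(row, x))) for row in w]
    return sum(wj * xj for wj, xj in zip(weights[-1][0], x))

# Tiny illustration: identity hidden layer, then an averaging output layer.
weights = [[[1.0, 0.0], [0.0, 1.0]],   # w_1 (2 x 2)
           [[0.5, 0.5]]]               # w_2 (1 x 2)
print(forward([2.0, 4.0], weights))    # 3.0
```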
It is to be understood that the computer device may input the model topology of the sample training model 30 and the model parameters of the sample training model 30 as shown in fig. 8 into the initial prediction model, by which the predicted performance ratio corresponding to the sample training model 30 may be predicted.
And S206, adjusting the initial prediction model according to the actual performance ratio and the prediction performance ratio.
Specifically, the computer device may adjust the model parameters of the initial prediction model (i.e., w_i in the above formula (2)) according to the actual performance ratio and the predicted performance ratio of the above sample training model, so as to obtain an adjusted initial prediction model.

It should be understood that the computer device may adjust the model parameters of the initial prediction model (i.e., w_i in the above formula (2)) according to the actual performance ratio and the predicted performance ratio of the sample training model 30 shown in fig. 8, so as to obtain an adjusted initial prediction model.
And S207, when the adjusted initial prediction model meets the convergence condition, determining the adjusted initial prediction model as the prediction model associated with the original model.
Specifically, the computer device determines a loss value of the adjusted initial prediction model according to an actual performance ratio, a prediction performance ratio of the adjusted initial prediction model, and a regularization loss value of the adjusted initial prediction model. When the loss value is less than the loss function threshold, the computer device may obtain a prediction result success rate of the adjusted initial prediction model. When the prediction result success rate of the adjusted initial prediction model is greater than the prediction result success rate threshold, the computer device may determine the adjusted initial prediction model as the prediction model associated with the original model in the embodiment corresponding to fig. 3.
It should be appreciated that, since the initial prediction model is a nonlinear regression model, the computer device may use the most common loss function for such models, i.e., the sum-of-squares formula. Specifically, the loss function of the initial prediction model can be shown in the following formula (3):

Loss = Σ_i (y_i − y̅_i)² + loss_reg,    (3)

where the loss function of formula (3) consists of two parts. In the first part, y_i represents the predicted performance ratio of the ith sample training model predicted by the adjusted initial prediction model, and y̅_i represents the actual performance ratio corresponding to the ith sample training model. In the second part, loss_reg is the regularization loss value of the adjusted initial prediction model; the regularization loss value can effectively prevent overfitting of the parameters in the adjusted initial prediction model.
Wherein, it is understood that the computer device may input the sample training model into the adjusted initial prediction model, so as to output an adjusted prediction performance ratio corresponding to the sample training model. The computer device may then obtain a regularization loss value for the adjusted initial prediction model. Further, the computer device may determine the loss value of the adjusted initial prediction model according to the actual performance ratio of the sample training model, the adjusted prediction performance ratio of the sample training model, and the regularized loss value of the adjusted initial prediction model by the above formula (3).
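Formula (3) amounts to the computation below (a hedged sketch; the function name is an assumption):

```python
def prediction_loss(predicted_ratios, actual_ratios, regularization_loss):
    # Sum of squared differences between predicted and actual performance
    # ratios, plus the regularization loss value (formula (3)).
    squared_error = sum((y - y_bar) ** 2
                        for y, y_bar in zip(predicted_ratios, actual_ratios))
    return squared_error + regularization_loss

loss = prediction_loss([1.2, 1.0], [1.1, 1.05], regularization_loss=0.01)
print(round(loss, 4))  # 0.0225
```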
When the loss value of the adjusted initial prediction model is greater than or equal to the loss function threshold, the computer device may continue to adjust the model parameters of the adjusted initial prediction model (i.e., w_i in the above formula (2)), so as to obtain a prediction model capable of accurately predicting the model operation performance.
When the loss value of the adjusted initial prediction model is smaller than the loss function threshold, the computer device may obtain the prediction result success rate of the adjusted initial prediction model. When the prediction result success rate of the adjusted initial prediction model is greater than the prediction result success rate threshold, the computer device may determine the adjusted initial prediction model as the prediction model associated with the original model.
It will be appreciated that the computer device may determine the prediction result success rate of the adjusted initial prediction model by determining a prediction error of the adjusted initial prediction model. It can be understood that the computer device may perform pruning processing on the sample model, that is, randomly delete the model parameters in the model topology structure of the sample model, so as to obtain the sample test model. Further, the computer device may test the adjusted initial prediction model based on the sample test model, so as to obtain a sample test result corresponding to the sample test model. Further, the computer device may determine a prediction error of the adjusted initial prediction model based on the sample test results.
The computer device can obtain the test performance ratio in the sample test result, and determine the test model operation performance corresponding to the sample test model according to the test performance ratio in the sample test result and the original model operation performance by the formula (1). Further, the computer device can perform performance testing on the sample test model to obtain the actual model operation performance corresponding to the sample test model. Further, the computer device may determine a prediction error corresponding to the adjusted initial prediction model according to the test model operation performance, the actual model operation performance corresponding to the sample test model, and the original model operation performance. Specifically, the computer device determines the prediction error as shown in the following equation (4):
dev = |per_infe − per_act| / per_ori,    (4)

where per_infe represents the test model operating performance of the sample test model, per_act represents the actual model operating performance of the sample test model, per_ori represents the original model operating performance, and dev represents the prediction error corresponding to the sample test model.
Further, the computer apparatus may determine a sample test result corresponding to the sample test model having the prediction error greater than or equal to an error threshold (e.g., 0.05) as a failed sample test result, and count the number of the failed sample test results. For example, the running performance of the test model determined by the sample test model a through the adjusted initial prediction model is 780, and the computer device performs a performance test on the sample test model a to obtain an actual running performance of the test model is 850. Wherein the original model operating performance is 770. At this time, the computer device may determine that the prediction error determined according to the above formula (4) is 0.09, in other words, the prediction error corresponding to the sample test model a is greater than the error threshold, and the computer device may consider the sample test result corresponding to the sample test model a as a failed sample test result.
Further, the computer device may determine a sample test result corresponding to a sample test model having a prediction error smaller than the error threshold (e.g., 0.05) as a successful sample test result, and count the number of successful sample test results. For example, the test model operating performance determined by the sample test model B through the adjusted initial prediction model is 880, and the actual model operating performance obtained by the computer device performing a performance test on the sample test model B is 860. The original model operating performance is 770. At this time, the prediction error determined by the computer device according to the above formula (4) is 0.026; in other words, the prediction error corresponding to the sample test model B is smaller than the error threshold, and the computer device may consider the sample test result corresponding to the sample test model B as a successful sample test result.
Further, the computer device may determine the ratio of the number of successful sample test results to the total number of sample test results (i.e., the sum of the number of successful sample test results and the number of failed sample test results), and determine the ratio as the prediction result success rate of the adjusted initial prediction model.
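Formula (4) and the success rate count can be sketched together (hedged; the function names are assumptions, and the two example sample test models from the text are reused):

```python
def prediction_error(per_infe, per_act, per_ori):
    # Formula (4): dev = |per_infe - per_act| / per_ori.
    return abs(per_infe - per_act) / per_ori

def prediction_success_rate(errors, error_threshold=0.05):
    # Ratio of successful sample test results to the total number.
    successes = sum(1 for dev in errors if dev < error_threshold)
    return successes / len(errors)

errors = [prediction_error(780, 850, 770),   # ~0.09  -> failed result
          prediction_error(880, 860, 770)]   # ~0.026 -> successful result
print(prediction_success_rate(errors))       # 0.5
```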
If the success rate of the prediction result of the adjusted initial prediction model is less than or equal to the prediction success rate threshold, the computer equipment can perform pruning processing on the sample model again to generate a sample pruning model so as to reconstruct the initial prediction model.
If the success rate of the prediction result of the adjusted initial prediction model is greater than the prediction success rate threshold, the computer device may determine that the adjusted initial prediction model satisfies the convergence condition, and determine the adjusted initial prediction model satisfying the convergence condition as the prediction model associated with the original model.
It can be understood that, in the embodiment of the present application, performance tests can be performed on multiple sets of test data (for example, 15 sets of test models with different pruning degrees) respectively through a prediction model and a manual performance test, so that tested differences between the two performance test modes can be visually compared. The test model can be obtained by pruning and training an original model.
It should be understood that the computer device may perform manual performance tests on the 15 sets of test models respectively, so as to obtain model operation performances corresponding to the 15 sets of test models respectively, and further, the computer device may obtain actual performance ratios corresponding to the 15 sets of test models respectively according to the original model operation performances corresponding to the original model by using the above formula (1).
Further, the computer device may input the 15 sets of test models into the prediction models, respectively, so as to directly obtain the predicted performance ratios corresponding to the 15 sets of test models, respectively. The computer equipment can count the actual performance ratio and the predicted performance ratio corresponding to each test model in the 15 groups of test models, so that an effect comparison graph of the predicted performance ratio and the actual performance ratio obtained by the 15 groups of test models according to the two performance test modes can be obtained, and the difference between the two performance test modes can be visually seen.
For example, the predicted performance ratio of the test model A obtained through the prediction model may be 1.47, and the actual performance ratio of the test model A obtained through the manual performance test may be 1.50. It is understood that if the original model operation performance corresponding to the test model A is 770, the actual model operation performance of the test model A may be 1155 (i.e., 1.50 × 770), and the predicted model operation performance corresponding to the test model A may be 1131 (i.e., approximately 1.47 × 770). In other words, the actual time for the test model A to recognize one image may be about 0.87 ms, and the predicted time for the test model A to recognize one image may be about 0.88 ms. Therefore, there is no large error between the prediction performance ratio obtained by the test model A through the prediction model and the actual performance ratio obtained through the manual performance test, but the manual performance test consumes a long time; thus, using the prediction model of the embodiment of the application instead of the manual performance test can improve the efficiency of evaluating the model operation performance.
In the embodiment of the application, in order to construct a prediction model for more accurately predicting the performance ratio corresponding to the second model, the computer device may acquire a model topology structure of the original model associated with the second model to obtain sample data serving as the initial prediction model, and may further construct the prediction model for predicting the performance ratio of the model, so that the model operation performance corresponding to the second model may be quickly determined.
Further, please refer to fig. 9, which is a flowchart illustrating an automatic pruning method according to an embodiment of the present application. As shown in fig. 9, the method may be applied to a computer device having a model optimization function. As shown in fig. 9, the method may include:
s301, importing the neural network model into computer equipment, and training the neural network model to obtain an original model.
S302, determining the original model as a first model, and pruning the first model to obtain a pruning model.
S303, training the pruning model to obtain a second model.
S304, the success rate of the training result of the second model is obtained and matched with the success rate threshold value in the pruning judging condition.
S305, if the success rate of the training result does not reach the success rate threshold, acquiring a successful pruning model with the latest storage timestamp from the successful pruning model set stored locally, so that the computer equipment trains the successful pruning model with the latest storage timestamp again.
And S306, if the success rate of the training result reaches the success rate threshold, storing the second model to a successful pruning model set, and updating the successful pruning model with the latest storage timestamp.
S307, predicting the model operation performance corresponding to the second model, matching the model operation performance with the performance threshold value in the pruning judging condition, and if the model operation performance does not reach the performance threshold value, carrying out re-pruning treatment on the second model.
And S308, if the model operation performance reaches the performance threshold, determining the second model as the target model and exporting it from the computer device.
The computer device in the embodiment of the present application may be an entity terminal having a function of model optimization (e.g., model pruning), and the entity terminal may be a server or a user terminal. The embodiment of the present application takes the server 2000 shown in fig. 1 as an example to illustrate a process of performing model optimization on a neural network model by the computer device. It is understood that the neural network model may be a model for performing image recognition, a model for performing an artificial intelligence game, a model for performing voice recognition, or the like. The embodiment of the present application does not limit the service processing that can be performed by the neural network model.
For a specific implementation of the process of performing automatic pruning in this embodiment of the application, reference may be made to the descriptions of steps S101 to S105 in the embodiment corresponding to fig. 3 and steps S201 to S207 in the embodiment corresponding to fig. 7, and details will not be further described here.
Further, please refer to fig. 10, which is a scene diagram of data interaction using a target model according to an embodiment of the present application. The computer device in the embodiment of the present application may be an entity terminal having a model optimization function, and the entity terminal may be the server 2000 shown in fig. 1. The original model in the embodiment of the present application may be a model for playing an artificial intelligence game.
The computer device may use the original model as a first model, and may perform pruning on the first model to obtain a pruning model. Further, the computer device may train the pruning model to obtain a second model. At this point, the computer device may automatically prune the second model based on the success rate threshold and the performance threshold in the pruning determination condition, thereby obtaining the target model output by the computer device. It should be understood that, for a specific implementation manner of the computer device performing the automatic pruning processing on the original model, reference may be made to the description of step S101 to step S105 in the embodiment corresponding to fig. 3, and details will not be further described here.
It will be appreciated that the computer device may transmit the target model to a user terminal A having a network connection with the computer device. The user terminal A may be any one of the user terminals in the user terminal cluster shown in fig. 1, for example, the user terminal 3000a. It will be appreciated that the target model may be used in a target application on the user terminal A, and the target application can use the target model to perform the business processing of an artificial intelligence game.
It will be appreciated that the target model may be applied in a scenario in which a game user plays against a robot, for example game scenario A. As shown in fig. 10, the display interface 100 of the user terminal A may include the playing cards held by the game user corresponding to the user terminal A and the playing cards held by the robot against which the game user is playing. The robot can call the target model according to the cards played by the game user, so that its own play can be calculated quickly.
For example, the game user plays a single "3". The robot may then input the "3" into the target model, which calculates, from the robot's held playing cards, the play that responds to the "3", for example a single "5". Further, the target model may output that "5" to the display interface 100; in other words, the robot plays the "5".
Optionally, the game user plays two "6"s. The robot may then input the two "6"s into the target model, which calculates, from the held playing cards, the play that responds to the two "6"s, for example two "8"s. Further, the target model may output the two "8"s to the display interface 100; in other words, the robot plays the two "8"s.
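The robot's responses above amount to one inference call per turn. The following is only an illustrative stand-in: `toy_model` replaces the actual trained target model with trivial "answer with the same number of next-higher cards" logic, and all names are hypothetical:

```python
def robot_respond(target_model, user_play, hand):
    """One inference step: the model maps (user's play, held cards) to the robot's play."""
    play = target_model(user_play, hand)
    for card in play:          # the chosen cards must come from the robot's hand
        assert card in hand
    return play

# Stand-in "target model": respond with the same number of next-higher cards.
def toy_model(user_play, hand):
    candidates = sorted(c for c in hand if c > max(user_play))
    return candidates[: len(user_play)]

played = robot_respond(toy_model, user_play=[3], hand=[2, 5, 8])   # user plays a "3"
```

With the hand `[2, 5, 8]` the stand-in answers the "3" with a "5", matching the single-card example; a pair of "6"s against a hand containing two "8"s yields the pair of "8"s from the second example.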
It can be understood that the target model obtained by the computer device results from model optimization of the original model; compared with the original model, its storage space and memory bandwidth are reduced, so the target model can be deployed on the terminal device more easily and run locally, without relying on a background server for game AI processing. It should be understood that, since the terminal device performs game AI processing through the target model, no network communication is required; therefore, the embodiment of the present application can reduce game stutter caused by a poor network and can further improve the running speed of the game.
In the embodiment of the application, the computer device can prune and train the first model, thereby obtaining the second model and the training result success rate of the second model. Furthermore, there is no need to manually test the model operation performance of the second model: the computer device can predict the performance ratio corresponding to the second model through the prediction model, and can quickly obtain the model operation performance of the second model from the performance ratio and the original model operation performance, so the pruning process takes little time. The computer device can then automatically prune the first model according to the success rate threshold and the performance threshold in the pruning determination condition, quickly finding a balance point between the two thresholds and thereby improving the pruning efficiency of the model.
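The shortcut described above — deriving the second model's operation performance from the original model operation performance and a predicted performance ratio, then checking both thresholds — can be expressed in a few lines. The multiplicative relation between ratio and performance is an assumption for illustration; the patent only states that the operation performance is obtained from the two quantities:

```python
def predicted_model_performance(original_performance, performance_ratio):
    # Assumed relation: second-model performance = original performance * ratio
    # (e.g. a ratio of 1.4 would mean the pruned model runs 1.4x as fast).
    return original_performance * performance_ratio

def meets_pruning_condition(success_rate, performance,
                            success_threshold, performance_threshold):
    # Both the training result success rate and the predicted model operation
    # performance must clear their thresholds in the pruning determination
    # condition before the second model becomes the target model.
    return success_rate > success_threshold and performance >= performance_threshold

perf = predicted_model_performance(100.0, 1.4)
ok = meets_pruning_condition(0.92, perf, 0.9, 120.0)
```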
Further, please refer to fig. 11, which is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing means may be a computer program (comprising program code) running on a computer device, e.g. an application software; the data processing device can be used for executing the corresponding steps in the method provided by the embodiment of the application. As shown in fig. 11, the data processing apparatus 1 may include: the pruning module 10, the first training module 11, the first prediction module 12, the re-pruning module 13, the first determination module 14, the first obtaining module 15, the updating module 16, the second determination module 17, the second obtaining module 18, the deleting module 19, the second training module 20, the third obtaining module 21, the second prediction module 22, the adjusting module 23, the third determination module 24, the fourth determination module 25, the fourth obtaining module 26, the fifth determination module 27, and the regeneration module 28.
The pruning module 10 is configured to perform pruning processing on the first model to obtain a pruning model.
Wherein, this pruning module 10 includes: a first acquisition unit 101, a first determination unit 102, and a deletion unit 103.
The first obtaining unit 101 is configured to obtain a result influence degree corresponding to a channel parameter in the first model;
the first determining unit 102 is configured to determine a pruning channel in the channel parameters based on the result influence;
the deleting unit 103 is configured to delete the pruning channel in the first model to obtain a pruning model.
For specific implementation manners of the first obtaining unit 101, the first determining unit 102, and the deleting unit 103, reference may be made to the description of step S101 in the embodiment corresponding to fig. 3, and details will not be further described here.
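A minimal sketch of units 101 to 103, assuming the "result influence degree" of a channel is measured by the L1 norm of its weights — the influence metric and the keep-fraction are illustrative assumptions, as the patent leaves the metric unspecified:

```python
import numpy as np

def prune_channels(weights, keep_fraction=0.5):
    """weights: (out_channels, in_channels) array; drop the least influential rows."""
    # Unit 101: result influence degree per output channel (assumed: L1 norm).
    influence = np.abs(weights).sum(axis=1)
    # Unit 102: the channels with the lowest influence become the pruning channels.
    n_keep = max(1, int(len(influence) * keep_fraction))
    keep = np.argsort(influence)[-n_keep:]
    # Unit 103: delete the pruning channels to obtain the pruning model's weights.
    return weights[np.sort(keep)]

w = np.array([[1.0, -1.0], [0.1, 0.1], [2.0, 2.0], [0.0, 0.2]])
pruned = prune_channels(w, keep_fraction=0.5)
```

Here the two low-influence rows are deleted and the two high-influence channels survive.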
The first training module 11 is configured to train the pruning model to obtain a second model, and obtain a success rate of a training result of the second model.
Wherein, this first training module 11 includes: a training unit 111, a storage unit 112 and a reading unit 113.
The training unit 111 is configured to train the pruning model according to the first sample data to obtain a second model.
The storage unit 112 is configured to store the training result corresponding to the first sample data output by the second model in a statistical database; the statistical database is used for storing the success rate of the training result of the second model; the success rate of the training result is generated according to the sample label corresponding to the first sample data and the statistics of the training result;
the reading unit 113 is configured to read a success rate of the training result of the second model from the statistical database.
For specific implementation manners of the training unit 111, the storage unit 112, and the reading unit 113, reference may be made to the description of step S102 in the embodiment corresponding to fig. 3, and details will not be further described here.
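The statistical-database bookkeeping of units 112 and 113 reduces to storing (training result, sample label) pairs and counting matches. A plain list stands in for the statistical database here; all names are illustrative:

```python
def record_results(database, results, labels):
    """Unit 112: store each training result with its sample label."""
    database.extend(zip(results, labels))

def training_success_rate(database):
    # Unit 113 / statistics: the success rate is the fraction of stored
    # training results that match the sample label of their first sample data.
    hits = sum(1 for result, label in database if result == label)
    return hits / len(database)

db = []
record_results(db, results=[1, 0, 1, 1], labels=[1, 0, 0, 1])
rate = training_success_rate(db)   # 3 of 4 results match their labels
```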
The first prediction module 12 is configured to, if the success rate of the training result is greater than a success rate threshold, predict a model operation performance corresponding to the second model according to an original model operation performance and a model topology structure of the second model.
Wherein the first prediction module 12 comprises: an input unit 121, a prediction unit 122, a second acquisition unit 123, and a second determination unit 124.
The input unit 121 is configured to input the model topology of the second model and the model parameters of the second model to a prediction model associated with an original model if the success rate of the training result is greater than a success rate threshold; the original model is a model which is associated with the first model and is not subjected to pruning treatment;
the prediction unit 122 is configured to predict a performance ratio corresponding to the second model using the prediction model;
the second obtaining unit 123 is configured to perform a performance test on the original model, and obtain the running performance of the original model corresponding to the original model.
Wherein, the second obtaining unit 123 includes: a first obtaining sub-unit 1231, an averaging sub-unit 1232, and a first determining sub-unit 1233.
The first obtaining subunit 1231 is configured to perform a performance test on the original model, and obtain at least two initial operating performances corresponding to the original model;
the average processing subunit 1232 is configured to perform average processing on the at least two initial operating performances to obtain an average value corresponding to the at least two initial operating performances;
the first determining subunit 1233 is configured to determine the average value as the original model operation performance.
For a specific implementation manner of the first obtaining sub-unit 1231, the average processing sub-unit 1232, and the first determining sub-unit 1233, reference may be made to the description of obtaining the operation performance of the original model in the embodiment corresponding to fig. 3, and details will not be further described here.
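Subunits 1231 to 1233 amount to repeating the performance test and averaging. A sketch, with a deterministic stand-in for the actual benchmark harness:

```python
import statistics

def original_model_performance(run_performance_test, runs=5):
    """Run the performance test several times and average (subunits 1231-1233)."""
    if runs < 2:
        raise ValueError("at least two initial operating performances are required")
    samples = [run_performance_test() for _ in range(runs)]
    # Subunit 1232/1233: the mean of the initial operating performances is
    # taken as the original model operation performance.
    return statistics.mean(samples)

# Toy harness: a fixed sequence of measured values standing in for real tests.
measurements = iter([10.0, 12.0, 11.0])
perf = original_model_performance(lambda: next(measurements), runs=3)
```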
The second determining unit 124 is configured to determine the model operation performance corresponding to the second model based on the original model operation performance and the performance ratio.
For specific implementation manners of the input unit 121, the prediction unit 122, the second obtaining unit 123 and the second determining unit 124, reference may be made to the description of step S103 in the embodiment corresponding to fig. 3, and details will not be further described here.
The re-pruning module 13 is configured to perform re-pruning on the second model if the model operation performance corresponding to the second model does not reach the performance threshold;
the first determining module 14 is configured to determine that the second model is a target model for performing business processing if the model operation performance corresponding to the second model reaches the performance threshold.
The first obtaining module 15 is configured to, if the success rate of the training result is less than or equal to the success rate threshold, obtain a successful pruning model with a latest storage timestamp from a locally stored successful pruning model set; the successful pruning model is the second model with the training result success rate larger than the success rate threshold;
the updating module 16 is configured to update the pruning model based on the successful pruning model with the latest stored timestamp.
The second determining module 17 is configured to store the second model with the training result success rate greater than the success rate threshold in the successful pruning model set, and determine the second model as a successful pruning model with a latest storage timestamp in the successful pruning model set.
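The rollback behaviour of modules 15 to 17 — keep every successful second model with a storage timestamp, and fall back to the most recent one when training misses the success rate threshold — can be sketched as follows. A monotonically increasing counter stands in for the storage timestamp; all names are hypothetical:

```python
from itertools import count

_clock = count()   # stand-in for a storage timestamp source

def store_successful_model(models, model):
    # Module 17: tag each successful second model with a storage timestamp
    # and keep it in the successful pruning model set.
    models.append((next(_clock), model))

def latest_successful_model(models):
    # Modules 15-16: on failure, roll back to the successful pruning model
    # whose storage timestamp is the latest.
    return max(models, key=lambda entry: entry[0])[1] if models else None

models = []
store_successful_model(models, "pruned_v1")
store_successful_model(models, "pruned_v2")
```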
The second obtaining module 18 is configured to obtain model parameters of the sample model; the model parameters of the sample model and the model parameters of the original model have the same model topological structure;
the deleting module 19 is configured to randomly delete the channel parameters in the model parameters of the sample model to obtain a sample pruning model corresponding to the sample model;
the second training module 20 is configured to train the sample pruning model according to second sample data to obtain a sample training model;
the third obtaining module 21 is configured to obtain an actual performance ratio between the sample training model and the sample model;
the second prediction module 22 is configured to predict a prediction performance ratio corresponding to the sample training model through the initial prediction model.
Wherein the second prediction module 22 is further configured to:
inputting a model topology structure of the sample training model and model parameters of the sample training model to the initial prediction model, and predicting a prediction performance ratio corresponding to the sample training model by the initial prediction model.
The adjusting module 23 is configured to adjust the initial prediction model according to the actual performance ratio and the predicted performance ratio;
the third determining module 24 is configured to determine the adjusted initial prediction model as the prediction model associated with the original model when the adjusted initial prediction model satisfies the convergence condition.
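The patent does not specify the form of the initial prediction model; as one possible reading of modules 22 and 23, the sketch below uses a deliberately tiny linear predictor over assumed topology features (depth and total channel count), adjusted toward the actual performance ratio by a plain gradient step:

```python
import numpy as np

def topology_features(layer_channel_counts):
    # Assumed topology features: network depth and total channel count.
    return np.array([len(layer_channel_counts), sum(layer_channel_counts)], dtype=float)

class InitialPredictionModel:
    """Toy linear stand-in for the patent's (unspecified) initial prediction model."""

    def __init__(self, n_features=2):
        self.w = np.zeros(n_features)
        self.b = 1.0   # start by predicting a performance ratio of 1.0

    def predict(self, features):
        # Module 22: predict the performance ratio from the model topology.
        return float(features @ self.w + self.b)

    def adjust(self, features, actual_ratio, lr=1e-5):
        # Module 23: move the prediction toward the actual performance ratio
        # measured for the sample training model (gradient step on squared error).
        err = self.predict(features) - actual_ratio
        self.w -= lr * err * features
        self.b -= lr * err

model = InitialPredictionModel()
f = topology_features([64, 128, 64])   # hypothetical 3-layer topology
before = model.predict(f)              # the untrained guess, 1.0
for _ in range(50):
    model.adjust(f, actual_ratio=1.5)
after = model.predict(f)
```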
The fourth determining module 25 is configured to determine a loss value of the adjusted initial prediction model according to the actual performance ratio, the prediction performance ratio output by the adjusted initial prediction model, and the regularized loss value of the adjusted initial prediction model;
the fourth obtaining module 26 is configured to obtain a success rate of the prediction result of the adjusted initial prediction model when the loss value is smaller than the loss function threshold.
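Module 25 combines a prediction-error term with a regularization loss value. The exact functional form is not given in the patent; the sketch below assumes mean squared error over the performance ratios plus an L2 penalty on the predictor's weights:

```python
def prediction_loss(actual_ratios, predicted_ratios, weights, reg_lambda=0.01):
    # Assumed form of the module-25 loss: mean squared error between actual and
    # predicted performance ratios, plus an L2 regularization loss value over
    # the adjusted initial prediction model's weights.
    mse = sum((a - p) ** 2 for a, p in zip(actual_ratios, predicted_ratios)) / len(actual_ratios)
    reg = reg_lambda * sum(w * w for w in weights)
    return mse + reg

loss = prediction_loss([1.2, 1.5], [1.0, 1.5], weights=[0.5, -0.5], reg_lambda=0.01)
```

Training stops consulting this loss once it drops below the loss function threshold, at which point module 26 switches to checking the prediction result success rate.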
Wherein, the fourth obtaining module 26 includes: a test unit 261, a third determination unit 262, a fourth determination unit 263, and a statistics unit 264.
The testing unit 261 is configured to, when the loss value is smaller than the loss function threshold, test the adjusted initial prediction model based on a sample test model to obtain a sample test result corresponding to the sample test model; the sample test model is obtained by pruning the sample model;
the third determining unit 262 is configured to determine a prediction error corresponding to the adjusted initial prediction model based on the sample test result.
Wherein the third determining unit 262 includes: a second acquisition subunit 2621, a second determination subunit 2622, a third acquisition subunit 2623, and a third determination subunit 2624.
The second obtaining subunit 2621, configured to obtain a test performance ratio in the sample test result;
the second determining subunit 2622 is configured to determine, according to the test performance ratio in the sample test result and the operation performance of the original model, the operation performance of the test model corresponding to the sample test model;
the third obtaining subunit 2623 is configured to obtain an actual model operation performance corresponding to the sample test model;
the third determining subunit 2624 is configured to determine a prediction error corresponding to the adjusted initial prediction model according to the running performance of the test model, the actual running performance of the sample test model, and the running performance of the original model.
For specific implementation manners of the second obtaining subunit 2621, the second determining subunit 2622, the third obtaining subunit 2623, and the third determining subunit 2624, reference may be made to the description of the prediction error in the embodiment corresponding to fig. 7, which will not be further described here.
The fourth determining unit 263, configured to determine the sample test result with the prediction error smaller than the error threshold as a successful sample test result;
the statistic unit 264 is configured to count the prediction result success rate of the adjusted initial prediction model according to the total number of the sample test results and the number of the successful sample test results.
For specific implementation manners of the testing unit 261, the third determining unit 262, the fourth determining unit 263 and the counting unit 264, reference may be made to the description of obtaining the success rate of the predicted result in the embodiment corresponding to fig. 7, and details will not be further described here.
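Units 263 and 264 together reduce to a threshold test and a ratio. A minimal sketch, assuming the prediction errors have already been computed per unit 262:

```python
def prediction_success_rate(prediction_errors, error_threshold):
    # Unit 263: a sample test result whose prediction error is smaller than the
    # error threshold counts as a successful sample test result.
    successes = sum(1 for e in prediction_errors if e < error_threshold)
    # Unit 264: success rate = successful results / total results.
    return successes / len(prediction_errors)

rate = prediction_success_rate([0.02, 0.10, 0.04, 0.30], error_threshold=0.05)
```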
The fifth determining module 27 is configured to determine that the adjusted initial prediction model satisfies a convergence condition if the success rate of the prediction result of the adjusted initial prediction model is greater than the prediction success rate threshold;
the regeneration module 28 is configured to regenerate the sample pruning model to reconstruct the initial prediction model if the success rate of the prediction result of the adjusted initial prediction model is less than or equal to the prediction success rate threshold.
For specific implementation manners of the pruning module 10, the first training module 11, the first prediction module 12, the re-pruning module 13, the first determining module 14, the first obtaining module 15, the updating module 16, the second determining module 17, the second obtaining module 18, the deleting module 19, the second training module 20, the third obtaining module 21, the second prediction module 22, the adjusting module 23, the third determining module 24, the fourth determining module 25, the fourth obtaining module 26, the fifth determining module 27, and the regeneration module 28, reference may be made to the description of steps S101 to S105 in the embodiment corresponding to the foregoing fig. 3 and the description of steps S201 to S207 in the embodiment corresponding to the foregoing fig. 7, and no further description will be given here. In addition, the beneficial effects of the same method are not described in detail.
Further, please refer to fig. 12, which is a schematic diagram of a computer device according to an embodiment of the present application. As shown in fig. 12, the computer device 1000 may be the server 2000 in the corresponding embodiment of fig. 1, and the computer device 1000 may include: at least one processor 1001, such as a CPU, at least one network interface 1004, a user interface 1003, memory 1005, at least one communication bus 1002. Wherein a communication bus 1002 is used to enable connective communication between these components. The user interface 1003 may include a Display (Display) and a Keyboard (Keyboard), and the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 1005 may optionally also be at least one storage device located remotely from the aforementioned processor 1001. As shown in fig. 12, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a device control application program.
In the computer apparatus 1000 shown in fig. 12, the network interface 1004 is mainly used for network communication with the user terminal; the user interface 1003 is an interface for providing a user with input; and the processor 1001 may be used to invoke a device control application stored in the memory 1005 to implement:
pruning the first model to obtain a pruning model;
training the pruning model to obtain a second model, and acquiring the success rate of the training result of the second model;
if the success rate of the training result is greater than the success rate threshold, predicting the model operation performance corresponding to the second model according to the operation performance of the original model and the model topological structure of the second model;
if the model operation performance corresponding to the second model does not reach the performance threshold value, carrying out pruning treatment on the second model again;
and if the model operation performance corresponding to the second model reaches the performance threshold, determining that the second model is a target model for performing business processing.
It should be understood that the computer device 1000 described in this embodiment of the present application may perform the description of the data processing method in the embodiment corresponding to fig. 3 and fig. 7, and may also perform the description of the data processing apparatus 1 in the embodiment corresponding to fig. 11, which is not described herein again. In addition, the beneficial effects of the same method are not described in detail.
Further, here, it is to be noted that: an embodiment of the present application further provides a computer-readable storage medium, where the computer program executed by the aforementioned data processing apparatus 1 is stored in the computer-readable storage medium, and the computer program includes program instructions, and when the processor executes the program instructions, the description of the data processing method in the embodiment corresponding to fig. 3 or fig. 7 can be performed, so that details are not repeated here. In addition, the beneficial effects of the same method are not described in detail. For technical details not disclosed in embodiments of the computer-readable storage medium referred to in the present application, reference is made to the description of embodiments of the method of the present application.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure is only the preferred embodiments of the present application and certainly is not intended to limit the scope of the claims of the present application; therefore, equivalent variations made according to the claims of the present application still fall within the scope covered by the present application.

Claims (14)

1. A method of data processing, the method comprising:
pruning the first model to obtain a pruning model;
training the pruning model to obtain a second model, and acquiring the success rate of the training result of the second model;
if the success rate of the training result is greater than the success rate threshold value, predicting the model operation performance corresponding to the second model according to the operation performance of the original model and the model topological structure of the second model;
if the model operation performance corresponding to the second model does not reach the performance threshold value, carrying out pruning treatment on the second model again;
and if the model operation performance corresponding to the second model reaches the performance threshold, determining that the second model is a target model for performing business processing.
2. The method of claim 1, wherein the pruning the first model to obtain a pruning model comprises:
acquiring the result influence degree corresponding to the channel parameter in the model parameter of the first model;
determining a pruning channel in the channel parameters based on the resulting influence degree;
and deleting the pruning channel in the first model to obtain a pruning model.
3. The method of claim 1, wherein the training the pruning model to obtain a second model, and obtaining a success rate of a training result of the second model comprises:
training the pruning model according to the first sample data to obtain a second model;
storing a training result corresponding to the first sample data output by the second model to a statistical database; the statistical database is used for storing the success rate of the training result of the second model; the training result success rate is generated according to the sample label corresponding to the first sample data and the training result statistics;
and reading the success rate of the training result of the second model from the statistical database.
4. The method of claim 1, wherein if the success rate of the training result is greater than a success rate threshold, predicting the model operation performance corresponding to the second model according to the original model operation performance and the model topology structure of the second model comprises:
if the success rate of the training result is larger than the success rate threshold value, inputting the model topological structure of the second model and the model parameters of the second model into a prediction model associated with the original model; the original model is a model associated with the first model and not subjected to pruning;
predicting, by the predictive model, a performance ratio corresponding to the second model;
performing performance test on the original model to obtain the operation performance of the original model corresponding to the original model;
and determining the model operation performance corresponding to the second model based on the original model operation performance and the performance ratio.
5. The method according to claim 4, wherein the performing the performance test on the original model to obtain the operation performance of the original model corresponding to the original model comprises:
performing performance test on the original model to obtain at least two initial operating performances corresponding to the original model;
carrying out mean value processing on the at least two initial running performances to obtain mean values corresponding to the at least two initial running performances;
and determining the average value as the original model operation performance.
6. The method of claim 1, further comprising:
if the success rate of the training result is smaller than or equal to the success rate threshold, acquiring a successful pruning model with the latest storage timestamp from a locally stored successful pruning model set; the successful pruning model is the second model with the training result success rate larger than the success rate threshold;
updating the pruning model based on the successful pruning model with the latest stored timestamp.
7. The method of claim 6, further comprising:
and storing the second model with the training result success rate larger than the success rate threshold value in the successful pruning model set, and determining the second model as the successful pruning model with the latest storage time stamp in the successful pruning model set.
8. The method of claim 4, further comprising:
obtaining model parameters of a sample model; the model parameters of the sample model and the model parameters of the original model have the same model topological structure;
randomly deleting channel parameters in the model parameters of the sample model to obtain a sample pruning model corresponding to the sample model;
training the sample pruning model according to second sample data to obtain a sample training model;
obtaining an actual performance ratio between the sample training model and the sample model;
predicting a predicted performance ratio corresponding to the sample training model through an initial prediction model;
adjusting the initial prediction model according to the actual performance ratio and the predicted performance ratio;
determining the adjusted initial prediction model as the prediction model associated with the original model when the adjusted initial prediction model satisfies a convergence condition.
9. The method of claim 8, further comprising:
determining a loss value of the adjusted initial prediction model according to the actual performance ratio, the prediction performance ratio output by the adjusted initial prediction model and the regularization loss value of the adjusted initial prediction model;
when the loss value is smaller than a loss function threshold value, acquiring the success rate of the prediction result of the adjusted initial prediction model;
if the success rate of the prediction result of the adjusted initial prediction model is greater than the threshold of the success rate of the prediction, determining that the adjusted initial prediction model meets the convergence condition;
and if the success rate of the prediction result of the adjusted initial prediction model is less than or equal to the prediction success rate threshold, regenerating a sample pruning model to reconstruct the initial prediction model.
10. The method according to claim 9, wherein obtaining the prediction result success rate of the adjusted initial prediction model when the loss value is smaller than the loss function threshold comprises:
when the loss value is smaller than a loss function threshold, testing the adjusted initial prediction model based on a sample test model to obtain a sample test result corresponding to the sample test model; the sample test model is obtained by pruning the sample model;
determining a prediction error corresponding to the adjusted initial prediction model based on the sample test result;
determining the sample test result with the prediction error smaller than the error threshold value as a successful sample test result;
and counting the success rate of the prediction result of the adjusted initial prediction model according to the total number of the sample test results and the number of the successful sample test results.
11. The method of claim 10, wherein determining a prediction error corresponding to the adjusted initial prediction model based on the sample test results comprises:
obtaining a test performance ratio in the sample test result;
determining the running performance of the test model corresponding to the sample test model according to the test performance ratio in the sample test result and the running performance of the original model;
acquiring actual model operation performance corresponding to the sample test model;
and determining the prediction error corresponding to the adjusted initial prediction model according to the running performance of the test model, the actual model running performance of the sample test model and the running performance of the original model.
12. A data processing apparatus, characterized in that the apparatus comprises:
the pruning module is used for carrying out pruning processing on the first model to obtain a pruning model;
the first training module is used for training the pruning model to obtain a second model and obtain the success rate of the training result of the second model;
the first prediction module is used for predicting the model operation performance corresponding to the second model according to the operation performance of an original model and the model topological structure of the second model if the success rate of the training result is greater than the success rate threshold;
the re-pruning module is used for re-pruning the second model if the running performance of the model corresponding to the second model does not reach the performance threshold;
and the first determining module is used for determining that the second model is a target model for performing business processing if the model operation performance corresponding to the second model reaches the performance threshold.
13. A computer device, comprising: a processor, a memory, a network interface;
the processor is connected to the memory and the network interface, wherein the network interface is configured to provide data communication functions, the memory is configured to store a computer program, and the processor is configured to call the computer program to perform the method according to any one of claims 1 to 11.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method according to any one of claims 1-11.
CN202010079087.1A 2020-02-03 2020-02-03 Data processing method, device, computer equipment and storage medium Active CN111310918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010079087.1A CN111310918B (en) 2020-02-03 2020-02-03 Data processing method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111310918A true CN111310918A (en) 2020-06-19
CN111310918B CN111310918B (en) 2023-07-14

Family

ID=71145490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010079087.1A Active CN111310918B (en) 2020-02-03 2020-02-03 Data processing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111310918B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368891A (en) * 2017-05-27 2017-11-21 深圳市深网视界科技有限公司 Compression method and device for a deep learning model
US20180173994A1 (en) * 2016-12-15 2018-06-21 WaveOne Inc. Enhanced coding efficiency with progressive representation
CN109460613A (en) * 2018-11-12 2019-03-12 北京迈格威科技有限公司 Model cropping method and device
US20190130272A1 (en) * 2017-10-26 2019-05-02 Uber Technologies, Inc. Generating compressed representation neural networks having high degree of accuracy
CN110059823A (en) * 2019-04-28 2019-07-26 中国科学技术大学 Deep neural network model compression method and device
WO2019186194A2 (en) * 2018-03-29 2019-10-03 Benevolentai Technology Limited Ensemble model creation and selection
CN110414673A (en) * 2019-07-31 2019-11-05 北京达佳互联信息技术有限公司 Multimedia recognition method, device, equipment and storage medium
CN110674939A (en) * 2019-08-31 2020-01-10 电子科技大学 Deep neural network model compression method based on automatic pruning-threshold search

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Yulong Wang et al., "Pruning from Scratch", arXiv, pages 1-12 *
Peng Dongliang et al., "A Pruning Algorithm Based on the GoogLeNet Model", Control and Decision, vol. 34, no. 6, pages 1259-1264 *
Cao Yangjie et al., "Adaptive PPM Prediction Model Based on Pruning Technique", Computer Engineering and Applications, pages 141-144 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539224A (en) * 2020-06-25 2020-08-14 北京百度网讯科技有限公司 Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN111553169A (en) * 2020-06-25 2020-08-18 北京百度网讯科技有限公司 Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN111539224B (en) * 2020-06-25 2023-08-25 北京百度网讯科技有限公司 Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN111553169B (en) * 2020-06-25 2023-08-25 北京百度网讯科技有限公司 Pruning method and device of semantic understanding model, electronic equipment and storage medium
WO2022105714A1 (en) * 2020-11-23 2022-05-27 华为技术有限公司 Data processing method, machine learning training method and related apparatus, and device

Also Published As

Publication number Publication date
CN111310918B (en) 2023-07-14

Similar Documents

Publication Publication Date Title
CN111310918B (en) Data processing method, device, computer equipment and storage medium
CN110766080B (en) Method, device and equipment for determining labeled sample and storage medium
CN111352965B (en) Training method of sequence mining model, and processing method and equipment of sequence data
CN111008640A (en) Image recognition model training and image recognition method, device, terminal and medium
CN112052948B (en) Network model compression method and device, storage medium and electronic equipment
CN116049412B (en) Text classification method, model training method, device and electronic equipment
CN112036564B (en) Picture identification method, device, equipment and storage medium
CN111814975A (en) Pruning-based neural network model construction method and related device
CN115221396A (en) Information recommendation method and device based on artificial intelligence and electronic equipment
CN113128671A (en) Service demand dynamic prediction method and system based on multi-mode machine learning
CN112200208B (en) Cloud workflow task execution time prediction method based on multi-dimensional feature fusion
CN111783688B (en) Remote sensing image scene classification method based on convolutional neural network
CN110855474B (en) Network feature extraction method, device, equipment and storage medium of KQI data
CN112463964B (en) Text classification and model training method, device, equipment and storage medium
CN115982634A (en) Application program classification method and device, electronic equipment and computer program product
CN114970357A (en) Energy-saving effect evaluation method, system, device and storage medium
CN114861917A (en) Knowledge graph inference model, system and inference method for Bayesian small sample learning
CN113836005A (en) Virtual user generation method and device, electronic equipment and storage medium
CN113360772A (en) Interpretable recommendation model training method and device
CN109308565B (en) Crowd performance grade identification method and device, storage medium and computer equipment
CN112052386A (en) Information recommendation method and device and storage medium
CN113688989B (en) Deep learning network acceleration method, device, equipment and storage medium
CN113011893B (en) Data processing method, device, computer equipment and storage medium
CN117456286B (en) Ginseng grading method, device and equipment
He et al. Model Training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40024278

Country of ref document: HK

GR01 Patent grant