CN111310918B - Data processing method, device, computer equipment and storage medium - Google Patents

Data processing method, device, computer equipment and storage medium

Info

Publication number
CN111310918B
CN111310918B
Authority
CN
China
Prior art keywords
model
pruning
sample
performance
success rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010079087.1A
Other languages
Chinese (zh)
Other versions
CN111310918A (en)
Inventor
余翀
张银锋
张应国
邓巍然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010079087.1A
Publication of CN111310918A
Application granted
Publication of CN111310918B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the application discloses a data processing method, a device, computer equipment and a storage medium, wherein the method comprises the following steps: pruning the first model to obtain a pruning model; training the pruning model to obtain a second model, and obtaining the training result success rate of the second model; if the training result success rate is greater than a success rate threshold, predicting the model running performance corresponding to the second model according to the original model running performance and the model topological structure of the second model; if the model running performance corresponding to the second model does not reach a performance threshold, pruning the second model again; and if the model running performance corresponding to the second model reaches the performance threshold, determining the second model as a target model for business processing. By adopting the embodiment of the application, automatic pruning can be realized, so that the efficiency of model pruning can be improved.

Description

Data processing method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data processing method, a data processing device, a computer device, and a storage medium.
Background
Various business processes, such as image recognition, artificial intelligence games, and voice recognition, can be performed through a deep learning neural network model. Because a neural network model contains a large number of model parameters, it consumes a large amount of storage space. Therefore, in order to reduce the storage overhead of the neural network model, model pruning needs to be performed on it.
Currently, a neural network model is usually pruned manually. When the neural network model is pruned, the model running performance of the pruned neural network model is tested manually, so as to obtain a neural network model whose model running performance reaches a performance threshold. However, with manual pruning, multiple rounds of pruning are usually required before a neural network model that reaches the performance threshold is obtained. Because the model running performance of the pruned neural network model must be tested manually after each round of pruning, the whole manual pruning process is time-consuming, which reduces model pruning efficiency.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device, computer equipment and a storage medium, which can improve the pruning efficiency of a model.
An aspect of an embodiment of the present application provides a data processing method, including:
performing pruning processing on a first model to obtain a pruning model;
training the pruning model to obtain a second model, and obtaining a training result success rate of the second model;
if the training result success rate is greater than a success rate threshold, predicting the model running performance corresponding to the second model according to the original model running performance and the model topological structure of the second model;
if the model running performance corresponding to the second model does not reach a performance threshold, performing pruning processing on the second model again;
and if the model running performance corresponding to the second model reaches the performance threshold, determining the second model as a target model for business processing.
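For ease of understanding, the overall flow of these steps can be sketched as the following loop. This is an illustrative Python sketch only: the callables prune, train, success_rate, and predict_qps are hypothetical placeholders for the steps above, and the thresholds are the example values (95% and 850) used later in this description.

```python
def automated_pruning(first_model, prune, train, success_rate, predict_qps,
                      success_threshold=0.95, performance_threshold=850):
    """Hypothetical sketch of the claimed automated pruning loop."""
    model = first_model
    latest_successful = None
    while True:
        second_model = train(prune(model))          # prune, then retrain
        if success_rate(second_model) <= success_threshold:
            # roll back: continue from the successful pruning model with the
            # latest stored timestamp (the original model on the first pass)
            model = latest_successful if latest_successful else first_model
            continue
        latest_successful = second_model
        if predict_qps(second_model) >= performance_threshold:
            return second_model       # target model for business processing
        model = second_model          # performance not reached: prune again
```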
The pruning processing performed on the first model to obtain the pruning model comprises the following steps:
obtaining result influence degrees corresponding to the channel parameters of the model parameters in the first model;
determining a pruning channel in the channel parameters based on the result influence degrees;
and deleting the pruning channel from the first model to obtain the pruning model.
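As an illustration of how pruning channels might be selected and deleted, a minimal sketch follows. It assumes the result influence degree has already been reduced to one scalar per channel; the helper names and the use of NumPy are assumptions for illustration, not part of the disclosure.

```python
import numpy as np

def select_pruning_channels(result_influence, k):
    """Indices of the k channels whose result influence degree is smallest."""
    order = np.argsort(result_influence)  # ascending: least influential first
    return order[:k].tolist()

def delete_pruning_channels(weight_matrix, channel_indices, axis=0):
    """Delete the selected channels (dimensions) from a weight matrix."""
    return np.delete(weight_matrix, channel_indices, axis=axis)
```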
The training the pruning model to obtain a second model and obtaining the training result success rate of the second model comprises the following steps:
training the pruning model according to first sample data to obtain the second model;
storing training results corresponding to the first sample data output by the second model into a statistical database, wherein the statistical database is used for storing the training result success rate of the second model, and the training result success rate is generated statistically according to the training results and the sample labels corresponding to the first sample data;
and reading the training result success rate of the second model from the statistical database.
The predicting, if the training result success rate is greater than the success rate threshold, the model running performance corresponding to the second model according to the original model running performance and the model topological structure of the second model includes:
if the training result success rate is greater than the success rate threshold, inputting the model topological structure of the second model and the model parameters of the second model into a prediction model associated with the original model, wherein the original model is a model which is associated with the first model and has not undergone pruning processing;
predicting a performance ratio corresponding to the second model through the prediction model;
performing a performance test on the original model to obtain the original model running performance corresponding to the original model;
and determining the model running performance corresponding to the second model based on the original model running performance and the performance ratio.
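A minimal sketch of the final step, under the assumption that the model running performance of the second model is the product of the original model running performance and the predicted performance ratio; the text names these two inputs but does not state the formula explicitly.

```python
def model_running_performance(original_qps, performance_ratio):
    # Assumed multiplicative relation between the original model running
    # performance and the predicted performance ratio.
    return original_qps * performance_ratio

# Example: an original model measured at 1000 QPS and a predicted ratio of
# 0.88 give an estimated 880 QPS for the second model.
```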
The performing a performance test on the original model to obtain the original model running performance corresponding to the original model includes:
performing the performance test on the original model to obtain at least two initial running performances corresponding to the original model;
averaging the at least two initial running performances to obtain an average value corresponding to the at least two initial running performances;
and determining the average value as the original model running performance.
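Illustratively, this averaging step might look as follows; run_performance_test is a hypothetical callable that executes one performance test on the original model and returns a QPS value.

```python
import statistics

def original_model_running_performance(run_performance_test, trials=3):
    """Mean of at least two initial running performances (trials >= 2)."""
    initial_performances = [run_performance_test() for _ in range(trials)]
    return statistics.mean(initial_performances)
```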
Wherein the method further comprises:
if the training result success rate is less than or equal to the success rate threshold, obtaining, from a locally stored successful pruning model set, the successful pruning model with the latest storage timestamp, where a successful pruning model refers to a second model whose training result success rate is greater than the success rate threshold;
and updating the pruning model based on the successful pruning model with the latest storage timestamp.
Wherein the method further comprises:
storing the second model whose training result success rate is greater than the success rate threshold into the successful pruning model set, and determining the second model as the successful pruning model with the latest storage timestamp in the successful pruning model set.
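A toy sketch of the successful pruning model set and its latest-timestamp lookup; the in-memory representation is an assumption made for illustration.

```python
import time

class SuccessfulPruningModelSet:
    """Stand-in for the locally stored successful pruning model set."""
    def __init__(self):
        self._entries = []                     # (storage timestamp, model)

    def store(self, model):
        self._entries.append((time.time(), model))

    def latest(self):
        # the successful pruning model with the latest storage timestamp
        return max(self._entries, key=lambda entry: entry[0])[1]
```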
Wherein the method further comprises:
obtaining model parameters of a sample model; the model parameters of the sample model and the model parameters of the original model have the same model topological structure;
randomly deleting channel parameters from the model parameters in the model topological structure of the sample model to obtain a sample pruning model corresponding to the sample model;
training the sample pruning model according to second sample data to obtain a sample training model;
acquiring an actual performance ratio between the sample training model and the sample model;
predicting a prediction performance ratio corresponding to the sample training model through an initial prediction model;
adjusting the initial prediction model according to the actual performance ratio and the predicted performance ratio;
and when the adjusted initial prediction model meets a convergence condition, determining the adjusted initial prediction model as the prediction model associated with the original model.
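For illustration, the adjustment of the initial prediction model can be sketched as a regression fit. This is a minimal sketch assuming a linear predictor trained with gradient descent on an L2-regularized mean-squared-error loss; the patent does not fix the predictor's concrete architecture.

```python
import numpy as np

def adjust_initial_prediction_model(features, actual_ratios,
                                    lr=1e-3, epochs=500, l2=1e-4):
    """Fit predicted ratio = features @ w + b to the actual performance
    ratios. `features` (n_samples, n_features) is assumed to encode each
    sample training model's topology, e.g. remaining channels per layer."""
    n, d = features.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        errors = features @ w + b - actual_ratios     # predicted - actual
        w -= lr * (features.T @ errors / n + l2 * w)  # regularized gradient
        b -= lr * errors.mean()
    return w, b
```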
Wherein predicting, by the initial prediction model, a prediction performance ratio corresponding to the sample training model includes:
and inputting the model topological structure of the sample training model and the model parameters of the sample training model into the initial prediction model, and predicting the prediction performance ratio corresponding to the sample training model through the initial prediction model.
Wherein the method further comprises:
determining a loss value of the adjusted initial prediction model according to the actual performance ratio, the prediction performance ratio output by the adjusted initial prediction model and the regularized loss value of the adjusted initial prediction model;
when the loss value is smaller than a loss function threshold value, obtaining a prediction result success rate of the adjusted initial prediction model;
if the success rate of the prediction result of the adjusted initial prediction model is greater than a prediction success rate threshold value, determining that the adjusted initial prediction model meets a convergence condition;
and if the success rate of the prediction result of the adjusted initial prediction model is smaller than or equal to the threshold value of the prediction success rate, regenerating a sample pruning model to reconstruct the initial prediction model.
Wherein when the loss value is smaller than a loss function threshold, obtaining a prediction result success rate of the adjusted initial prediction model includes:
when the loss value is smaller than the loss function threshold, testing the adjusted initial prediction model based on a sample test model to obtain a sample test result corresponding to the sample test model; the sample test model is obtained by pruning the sample model;
determining a prediction error corresponding to the adjusted initial prediction model based on the sample test result;
determining the sample test result with the prediction error smaller than the error threshold value as a successful sample test result;
and counting the success rate of the prediction results of the adjusted initial prediction model according to the total number of the sample test results and the number of the successful sample test results.
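A minimal sketch of this statistic, assuming each sample test result has already been reduced to a scalar prediction error:

```python
def prediction_result_success_rate(prediction_errors, error_threshold):
    """Share of sample test results counted as successful, i.e. those
    whose prediction error is smaller than the error threshold."""
    successful = sum(1 for error in prediction_errors if error < error_threshold)
    return successful / len(prediction_errors)
```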
Wherein determining the prediction error corresponding to the adjusted initial prediction model based on the sample test result includes:
obtaining a test performance ratio in the sample test result;
determining the test model running performance corresponding to the sample test model according to the test performance ratio in the sample test result and the original model running performance;
acquiring the actual model running performance corresponding to the sample test model;
and determining the prediction error corresponding to the adjusted initial prediction model according to the test model running performance, the actual model running performance of the sample test model, and the original model running performance.
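The text names the three quantities that enter the prediction error but not the formula; one plausible reading is sketched below, where the gap between predicted and actual performance is normalized by the original model running performance. This normalization is an assumption.

```python
def prediction_error(test_qps, actual_qps, original_qps):
    # test_qps: test model running performance derived from the predicted
    # performance ratio; actual_qps: measured running performance of the
    # sample test model; original_qps: original model running performance.
    return abs(test_qps - actual_qps) / original_qps
```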
In one aspect, a data processing apparatus is provided, which is integrated in a computer device and includes:
the pruning module is used for pruning the first model to obtain a pruning model;
the first training module is used for training the pruning model to obtain a second model and obtaining the success rate of training results of the second model;
the first prediction module is used for predicting the model running performance corresponding to the second model according to the original model running performance and the model topological structure of the second model if the success rate of the training result is greater than a success rate threshold;
the pruning module is used for pruning the second model again if the running performance of the model corresponding to the second model does not reach the performance threshold;
and the first determining module is used for determining the second model as a target model for business processing if the model running performance corresponding to the second model reaches the performance threshold.
Wherein the pruning module includes:
the first acquisition unit is used for acquiring the result influence degrees corresponding to the channel parameters in the first model;
the first determining unit is used for determining a pruning channel in the channel parameters based on the result influence degrees;
and the deleting unit is used for deleting the pruning channel from the first model to obtain the pruning model.
Wherein, this first training module includes:
the training unit is used for training the pruning model according to the first sample data to obtain a second model;
the storage unit is used for storing training results corresponding to the first sample data output by the second model into a statistical database; the statistical database is used for storing the training result success rate of the second model; the success rate of the training result is generated according to the sample label corresponding to the first sample data and the training result statistics;
and the reading unit is used for reading the training result success rate of the second model from the statistical database.
Wherein the first prediction module comprises:
the input unit is used for inputting the model topological structure of the second model and the model parameters of the second model into a prediction model related to the original model if the success rate of the training result is greater than a success rate threshold value; the original model is a model which is associated with the first model and is not subjected to pruning treatment;
a prediction unit, used for predicting the performance ratio corresponding to the second model through the prediction model;
the second acquisition unit is used for performing a performance test on the original model to acquire the original model running performance corresponding to the original model;
and the second determining unit is used for determining the model running performance corresponding to the second model based on the original model running performance and the performance ratio.
Wherein the second acquisition unit includes:
the first acquisition subunit is used for performing performance test on the original model to acquire at least two initial running performances corresponding to the original model;
the average value processing subunit is used for carrying out average value processing on the at least two initial running performances to obtain an average value corresponding to the at least two initial running performances;
and the first determination subunit is used for determining the average value as the original model running performance.
Wherein the apparatus further comprises:
the first acquisition module is used for acquiring a successful pruning model with the latest storage timestamp from a locally stored successful pruning model set if the success rate of the training result is smaller than or equal to the success rate threshold; the successful pruning model refers to the second model with the success rate of the training result being greater than the success rate threshold;
and the updating module is used for updating the pruning model based on the successful pruning model with the latest stored timestamp.
Wherein the apparatus further comprises:
and the second determining module is used for storing the second model with the training result success rate larger than the success rate threshold value in the successful pruning model set, and determining the second model as a successful pruning model with the latest storage time stamp in the successful pruning model set.
Wherein the apparatus further comprises:
the second acquisition module is used for acquiring model parameters of the sample model; the model parameters of the sample model and the model parameters of the original model have the same model topological structure;
the deleting module is used for randomly deleting the channel parameters in the model parameters of the sample model to obtain a sample pruning model corresponding to the sample model;
the second training module is used for training the sample pruning model according to second sample data to obtain a sample training model;
a third obtaining module, configured to obtain an actual performance ratio between the sample training model and the sample model;
the second prediction module is used for predicting the prediction performance ratio corresponding to the sample training model through the initial prediction model;
the adjusting module is used for adjusting the initial prediction model according to the actual performance ratio and the prediction performance ratio;
and a third determining module, configured to determine the adjusted initial prediction model as a prediction model associated with the original model when the adjusted initial prediction model meets a convergence condition.
Wherein the second prediction module is further configured to:
and inputting the model topological structure of the sample training model and the model parameters of the sample training model into the initial prediction model, and predicting the prediction performance ratio corresponding to the sample training model through the initial prediction model.
Wherein the apparatus further comprises:
a fourth determining module, configured to determine a loss value of the adjusted initial prediction model according to the actual performance ratio, the prediction performance ratio output by the adjusted initial prediction model, and a regularized loss value of the adjusted initial prediction model;
a fourth obtaining module, configured to obtain a success rate of a prediction result of the adjusted initial prediction model when the loss value is smaller than a loss function threshold;
a fifth determining module, configured to determine that the adjusted initial prediction model meets a convergence condition if a success rate of a prediction result of the adjusted initial prediction model is greater than a prediction success rate threshold;
and the regeneration module is used for regenerating the sample pruning model to reconstruct the initial prediction model if the success rate of the prediction result of the adjusted initial prediction model is smaller than or equal to the prediction success rate threshold.
Wherein, this fourth acquisition module includes:
the test unit is used for testing the adjusted initial prediction model based on a sample test model when the loss value is smaller than the loss function threshold, to obtain a sample test result corresponding to the sample test model; the sample test model is obtained by pruning the sample model;
a third determining unit, configured to determine a prediction error corresponding to the adjusted initial prediction model based on the sample test result;
a fourth determining unit, configured to determine, as a successful sample test result, a sample test result with the prediction error smaller than an error threshold;
and the statistics unit is used for counting the success rate of the prediction result of the adjusted initial prediction model according to the total number of the sample test results and the number of the successful sample test results.
Wherein the third determination unit includes:
the second acquisition subunit is used for acquiring the test performance ratio in the sample test result;
the second determining subunit is used for determining the test model running performance corresponding to the sample test model according to the test performance ratio in the sample test result and the original model running performance;
the third acquisition subunit is used for acquiring the actual model running performance corresponding to the sample test model;
and the third determination subunit is used for determining the prediction error corresponding to the adjusted initial prediction model according to the running performance of the test model, the actual model running performance of the sample test model and the original model running performance.
In one aspect, the present application provides a computer device, including: a processor, a memory, and a network interface;
the processor is connected to the memory and the network interface, wherein the network interface is configured to provide data communication functions, the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the method in the above aspect of the embodiments of the application.
An aspect of the present application provides a computer readable storage medium storing a computer program, the computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method in the above aspect of the embodiments of the present application.
In the embodiment of the application, pruning processing and training can be performed on the first model to obtain a second model, and the training result success rate of the second model can be obtained. Further, the embodiment of the application does not require the model running performance of the second model to be tested manually; the model running performance of the second model can be predicted quickly according to the model topological structure of the second model and the original model running performance, so that the pruning process takes less time and model pruning efficiency is improved. In addition, the embodiment of the application can perform automatic pruning on the first model according to the success rate threshold and the performance threshold in the pruning determination conditions, so that a balance point can be quickly found between the success rate threshold and the performance threshold, further improving model pruning efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that a person skilled in the art may obtain other drawings from these drawings without inventive effort.
Fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present application;
fig. 2 is a schematic view of a scenario for service data interaction according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 4 is a system block diagram provided in an embodiment of the present application;
fig. 5 is a schematic diagram of counting a training result success rate according to an embodiment of the present application;
FIG. 6 is a schematic view of a scenario for determining a model operation performance corresponding to a second model according to an embodiment of the present application;
FIG. 7 is a flowchart of a method for determining a predictive model according to an embodiment of the present application;
FIG. 8 is a schematic view of a scenario for recording actual performance ratios corresponding to a sample training model according to an embodiment of the present disclosure;
FIG. 9 is a flow chart of an automated pruning provided in an embodiment of the present application;
FIG. 10 is a scene graph of data interaction using a target model provided in an embodiment of the present application;
FIG. 11 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
fig. 12 is a schematic diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. It is evident that the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without inventive effort fall within the protection scope of the present application.
Fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present application. As shown in fig. 1, the network architecture may include a server 2000 and a user terminal cluster. The user terminal cluster may include a plurality of user terminals, and may specifically include a user terminal 3000a, a user terminal 3000b, a user terminal 3000c, …, and a user terminal 3000n.
As shown in fig. 1, the user terminals 3000a, 3000b, 3000c, …, 3000n may respectively perform network connection with the above-mentioned server 2000, so that each user terminal may perform data interaction with the server 2000 through the network connection.
As shown in fig. 1, each user terminal in the user terminal cluster may be provided with a target application, and when the target application runs in each user terminal, data interaction may be performed between the target application and the server 2000 shown in fig. 1, where the target application may be an application that performs service processing through a target model, and the target model may perform service processing such as image recognition, artificial intelligence game, and voice recognition.
It should be understood that artificial intelligence (Artificial Intelligence, AI for short) is a new technical science that uses a digital computer, or a machine controlled by a digital computer (e.g., the server 2000 shown in fig. 1), to simulate, extend, and expand human intelligence. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
The artificial intelligence technology is a comprehensive discipline involving a wide range of fields, covering both hardware-level technologies and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include directions such as computer vision, speech processing, natural language processing, and machine learning/deep learning.
Among them, machine Learning (ML) is a multi-domain interdisciplinary, and involves multiple disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, etc. It is specially studied how a computer simulates or implements learning behavior of a human to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve own performance. Machine learning is the core of artificial intelligence, a fundamental approach to letting computers have intelligence, which is applied throughout various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, induction learning, and the like. It may be appreciated that the prediction model used to determine the prediction performance ratio corresponding to the model in the embodiments of the present application may be obtained through machine learning.
For easy understanding, in the embodiment of the present application, one user terminal may be selected from the plurality of user terminals shown in fig. 1 as a target user terminal, where the target user terminal may include: smart terminals with video data processing functions such as smart phones, tablet computers, desktop computers and the like. For example, in the embodiment of the present application, the user terminal 3000a shown in fig. 1 may be used as the target user terminal, where the target user terminal may be integrated with the target application, and at this time, the target user terminal may implement data interaction between the service data platform corresponding to the target application and the server 2000.
For easy understanding, further, please refer to fig. 2, which is a schematic view of a scenario for service data interaction according to an embodiment of the present application. The computer device in the embodiment of the present application may be an entity terminal with a model optimization function, where the entity terminal may be a server or a user terminal. The embodiment of the application takes the server 2000 shown in fig. 1 as an example to illustrate the process of model optimization of the neural network model by the computer device.
Wherein the raw model processed by the computer device may be a neural network model that has been trained. It is understood that the neural network model may be a model for performing image recognition, a model for performing an artificial intelligence game, a model for performing voice recognition, or the like. The embodiment of the application does not limit the business processing which can be performed by the neural network model. In the embodiment of the application, a neural network model for performing the service processing of image recognition is taken as an example, so as to describe a process of performing model optimization on the neural network model by the computer device.
It will be appreciated that the computer device may perform initial training on the neural network model for performing the image recognition service, and when the training result success rate of the neural network model is greater than the success rate threshold in the pruning determination conditions shown in fig. 2, the computer device may determine that the initial training of the neural network model is successful. The training result success rate is generated statistically according to the training results and the sample labels corresponding to the sample data used to train the neural network model. When the training result success rate is greater than the success rate threshold, it can be understood that the neural network model meets the convergence condition, that is, training is successful. It should be appreciated that, in the embodiments of the present application, a neural network model whose initial training is successful may be referred to as the original model.
As shown in fig. 2, the computer device may determine the original model as a first model that requires pruning. In this embodiment of the present application, a neural network model that needs to be pruned may be referred to as a first model. Further, the computer device may perform pruning processing on the first model to obtain a pruned first model; the first model obtained after pruning may be referred to as a pruning model. It should be appreciated that the computer device may train the pruning model multiple times based on first sample data to obtain a trained pruning model; the pruning model after training may be referred to as a second model. It may be appreciated that, when the pruning model is trained for the last time, the computer device may store the training results corresponding to the first sample data output by the second model into a statistical database, where the statistical database is used to store the training result success rate of the second model, and the training result success rate is generated statistically according to the training results and the sample labels corresponding to the first sample data.
It should be appreciated that in evaluating the second model, the computer device may make the determination based on the success rate threshold and the performance threshold in pruning determination conditions, as shown in fig. 2. Wherein the success rate threshold and the performance threshold may be set by a user. For example, the user may set the success rate threshold to 95% and the performance threshold to 850.
It will be appreciated that the computer device may read the training result success rate of the second model from the statistical database when evaluating the second model. Suppose the training result success rate read by the computer device from the statistical database is 93%; in other words, the training result success rate does not reach the success rate threshold (i.e., 95%). At this point, the computer device may obtain the successful pruning model with the latest stored timestamp from the locally stored successful pruning model set shown in fig. 2, where a successful pruning model refers to a second model whose training result success rate is greater than the success rate threshold. The computer device may then update the pruning model based on the successful pruning model with the latest stored timestamp; in other words, the computer device may train the successful pruning model with the latest stored timestamp.
Suppose instead that the training result success rate read by the computer device from the statistical database is 98%; that is, the training result success rate reaches the success rate threshold. At this point, the computer device may store the second model in the local successful pruning model set and determine the second model as the successful pruning model with the latest stored timestamp in the set.
Further, when the training result success rate of the second model is greater than the success rate threshold, the computer device may predict the performance of the second model based on the prediction model shown in fig. 2 to obtain the model running performance corresponding to the second model. The model running performance corresponding to the second model may be the query rate per second (Queries Per Second, abbreviated as QPS) of the second model, i.e., the number of service queries the second model performs in one second.
It will be appreciated that if the model running performance determined by the computer device for the second model is 800, then, since the second model may be a model for performing image recognition, the second model can recognize 800 images in 1 second, that is, recognizing one image takes 1.25 ms. The model running performance corresponding to the second model therefore does not reach the performance threshold (for example, 850). At this time, the computer device may re-prune the second model, i.e., the computer device may determine the second model as the first model to be pruned, so that pruning processing may be performed on it. It can be understood that the first model in the embodiment of the present application may be the original model, or may be a second model whose model running performance does not reach the performance threshold.
If the determined model running performance of the second model is 880, the model running performance of the second model reaches the performance threshold (e.g., 850). At this time, the computer device may determine the second model as the target model for performing business processing.
It should be appreciated that the target model may be used in the target application of any user terminal (e.g., user terminal 3000a) in the user terminal cluster illustrated in fig. 1 above. The target application may adopt the target model to perform the image recognition service. As shown in fig. 2, when the image 10 is input into the target model corresponding to the target application, the target application may determine through the target model that the output result of the image 10 is an animal. When the image 20 is input into the target model corresponding to the target application, the target application may determine through the target model that the output result of the image 20 is a person.
For the specific implementation in which the computer device performs model optimization on the first model based on the success rate threshold and the performance threshold in the pruning determination conditions, reference may be made to the embodiments corresponding to fig. 3-9 below.
Further, please refer to fig. 3, which is a flowchart illustrating a data processing method according to an embodiment of the present application. The method can be applied to computer equipment with a model optimization function. As shown in fig. 3, the method may include:
s101, pruning is carried out on the first model, and a pruning model is obtained.
Specifically, the computer device may obtain the result influence degree corresponding to each channel parameter of the model parameters in the first model. The first model may be a neural network model on which pruning processing needs to be performed. It should be appreciated that the computer device may sort the channel parameters by their corresponding result influence degrees (e.g., in descending or ascending order), so that pruning channels may be determined. Further, the computer device may delete the pruning channels from the first model, thereby obtaining a pruning model.
Wherein one channel parameter may correspond to one matrix dimension in the weight matrix of a model parameter. The result influence degree of a channel parameter may be determined from the elements in the dimension of the weight matrix to which the channel parameter corresponds. It should be understood that the result influence degree may be the gradient values of all matrix elements in the corresponding dimension of the weight matrix, or may be the average value of all matrix elements in that dimension, which is not limited here. The result influence degree refers to the degree of influence that the corresponding channel parameter has on the output result of the first model. It is understood that the greater the result influence degree, the greater the influence of the corresponding channel parameter on the output result of the first model.
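For illustration, the result influence degrees of the channel parameters of one model parameter might be computed as follows. This is a hedged sketch: the input array may hold either the gradients of the weight matrix (gradient-based influence) or the matrix elements themselves (average-of-elements influence), matching the two options above, and the magnitude-mean reduction is an assumption.

```python
import numpy as np

def result_influence_degrees(per_channel_values):
    """One scalar result influence degree per channel; axis 0 of
    `per_channel_values` indexes the channels (matrix dimensions)."""
    flat = per_channel_values.reshape(per_channel_values.shape[0], -1)
    return np.abs(flat).mean(axis=1)   # magnitude summary per dimension
```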
The computer device in the embodiment of the present application may be an entity terminal with a model optimization (for example, model pruning) function, where the entity terminal may be a server or a user terminal. The embodiment of the application takes the server 2000 shown in fig. 1 as an example to illustrate the process of model optimization of the neural network model by the computer device. It will be appreciated that the neural network model may be a model for image recognition, a model for artificial intelligence games, a model for speech recognition, etc. The embodiment of the application does not limit the business processing which can be performed by the neural network model.
For ease of understanding, further, please refer to fig. 4, which is a system configuration diagram provided in an embodiment of the present application. As shown in fig. 4, the system architecture diagram may include a training layer, an evaluation layer, and a pruning layer.
It should be understood that data transmission can be performed among the training layer, the evaluation layer and the pruning layer in the computer equipment, so that automatic pruning is realized, and a target model for business processing is finally output. It will be appreciated that the computer device may import a neural network model into the training layer for training, such that a neural network model that was successfully first trained, i.e., the original model (e.g., model a), may be obtained. The embodiment of the application can determine the model A as a first model, and input the model A into the pruning layer for pruning processing, so that a pruning model (for example, a model B) can be obtained. It is understood that the neural network model may be a model for performing image recognition, a model for performing an artificial intelligence game, a model for performing voice recognition, or the like. The embodiment of the application does not limit the business processing which can be performed by the neural network model.
It is understood that the model A may include a plurality of model parameters, where each of the model parameters corresponds to a weight matrix, and the number of channel parameters in each model parameter corresponds to the number of dimensions of the weight matrix. For example, the model A may include 3 model parameters, namely a model parameter a, a model parameter b, and a model parameter c. The model parameter a may include a 256-dimensional weight matrix a, the model parameter b may include a 256-dimensional weight matrix b, and the model parameter c may include a 128-dimensional weight matrix c. In other words, the model parameter a may contain 256 channel parameters, the model parameter b may contain 256 channel parameters, and the model parameter c may contain 128 channel parameters.
It should be understood that, in the embodiment of the present application, gradient values of all matrix elements in a certain dimension in a weight matrix may be used as the resultant influence degree of the channel parameter corresponding to the dimension. For example, the resulting influence of the first channel parameter in the model parameter a may be the gradient values of all matrix elements in the first dimension of the weight matrix a. The resulting effect of the second channel parameters in the model parameter a may be the gradient values of all matrix elements in the second dimension of the weight matrix a. And so on, further description will not be provided herein.
Further, when performing pruning processing, the computer device may determine pruning channels from the channel parameters according to the result influence degrees. It should be understood that the computer device may sort the result influence degrees of the channel parameters of all three model parameters together, and then delete the K channel parameters with the smallest result influence degrees (i.e., reduce the dimensions of the corresponding weight matrices), where K may be a positive integer. To prevent the model from being over-pruned as the number of pruning rounds grows, the number of channel parameters deleted in each round of pruning may differ: as the number of pruning rounds increases, the number of channel parameters pruned by the computer device may decrease in turn. In other words, the more rounds of pruning have been performed, the fewer channel parameters are deleted in a round. For example, when pruning is performed for the first time, the computer device may obtain the 8 channel parameters with the smallest result influence degrees and delete them from the first model; in the second round of pruning, the computer device may obtain the 6 channel parameters with the smallest result influence degrees and delete them.
Optionally, the computer device may sort, in descending order, the result influence degrees corresponding to the channel parameters within each model parameter, and then delete the K channel parameters with the smallest result influence degrees in each model parameter (i.e., reduce the dimensions of the corresponding weight matrix), where K may be a positive integer. For example, the model parameter a in the model A may include a 256-dimensional weight matrix a, the model parameter b may include a 256-dimensional weight matrix b, and the model parameter c may include a 128-dimensional weight matrix c; in other words, the model parameter a may contain 256 channel parameters, the model parameter b may contain 256 channel parameters, and the model parameter c may contain 128 channel parameters. When the computer device performs pruning processing on the model A, 2 channel parameters with the smallest result influence degrees may be deleted from each of the model parameters a, b, and c, so that a pruning model (for example, model B) can be obtained. The model parameter a in the model B may then include 254 channel parameters with a 254-dimensional weight matrix a, the model parameter b may include 254 channel parameters with a 254-dimensional weight matrix b, and the model parameter c may include 126 channel parameters with a 126-dimensional weight matrix c. In addition, when pruning the model, the number of channel parameters that the computer device deletes in each model parameter may also differ.
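The decreasing per-round pruning count described above might follow a simple schedule such as the one below; the linear decay is an assumption, and only the 8-then-6 example comes from the text.

```python
def channels_to_delete(pruning_round, initial_count=8, decay=2, minimum=1):
    # Later rounds delete fewer channels to avoid over-pruning:
    # rounds 0, 1, 2, ... delete 8, 6, 4, ... channels respectively.
    return max(initial_count - decay * pruning_round, minimum)
```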
S102, training the pruning model to obtain a second model, and obtaining the success rate of the training result of the second model.
Specifically, the computer device may train the pruning model multiple times according to the first sample data to obtain a pruning model after training is completed. The pruning model after training may be referred to as a second model in the embodiment of the present application. At this time, the computer device may store the training result corresponding to the first sample data output from the second model in a statistical database. The statistical database is used for storing the training result success rate of the second model; the training result success rate is generated according to the sample label corresponding to the first sample data and the training result statistics. Further, the computer device may read the training result success rate of the second model from the statistical database.
It should be appreciated that the computer device may train the pruning model multiple times based on the first sample data so that a second model may be derived. It will be appreciated that the computer device may adjust matrix elements of the weight matrix corresponding to each model parameter in the pruning model. As shown in fig. 4, the computer device may input the model B into the training layer for training to obtain a second model (e.g., model C), and further, the computer device may obtain a training result success rate of the model C.
It will be appreciated that the model B is obtained by pruning the model a by the computer device. At this time, the computer device may train the model B a plurality of times based on the sample data a (first sample data), so that the trained model B may be obtained. The pruning model after training may be referred to as a second model (e.g., model C).
It should be understood that the computer device may store the training results corresponding to the sample data a output by the model C to a statistical database. Further, the computer device may read the training result success rate of the model C from the statistical database.
For easy understanding, further, please refer to fig. 5, which is a schematic diagram of counting the training result success rate provided in an embodiment of the present application. As shown in fig. 5, the model 300 may be the second model whose training result success rate the computer device is counting at the current time. The original model associated with the model 300 may be the model 100.
It will be appreciated that the model 100 may be obtained by importing a neural network model into the computer device and training it multiple times. The neural network model may be a model for image recognition. Further, the computer device may train the model 100, so that the training result success rate of the model 100 may be determined, and may store the training result success rate of the model 100 (i.e., training result success rate 1) in the statistical database shown in fig. 5. It will be appreciated that when the training result success rate of the model 100 is greater than the success rate threshold of the pruning determination conditions, the computer device may store the model 100 in the local successful pruning model set and determine the model 100 as the successful pruning model with the latest stored timestamp in the set. Then, the computer device may determine the model running performance of the model 100; when the model running performance of the model 100 does not reach the performance threshold of the pruning determination conditions, the computer device may perform pruning processing on the model 100 and train the pruned model 100 multiple times, so that the model 200 may be obtained.
Further, the computer device may evaluate the model 200, so that a success rate of training results of the model 200 may be obtained. Further, the computer device may store the training result success rate (i.e., training result success rate 2) of the model 200 in a statistical database as shown in fig. 5. It will be appreciated that when the training result success rate of the model 200 is greater than the success rate threshold in the pruning decision condition, the computer device may further store the model 200 in a local set of successful pruning models, and determine the model 200 as the successful pruning model in the set of successful pruning models with the latest stored timestamp. Further, the computer device may determine the model operation performance of the model 200, and when the model operation performance of the model 200 does not reach the performance threshold in the pruning decision condition, the computer device may perform pruning processing on the model 200 and perform training on the pruned model 200 multiple times, so that the model 300 as shown in fig. 4 may be obtained.
The computer device in the embodiment of the present application may be a computer device corresponding to fig. 4. It should be appreciated that the embodiment of the present application takes the model 300 as an example to illustrate the process of the computer device counting the success rate of the training results of the second model. It will be appreciated that the computer device may input the sample data a (i.e., the first sample data) shown in fig. 4 into the model 300, to obtain the training result corresponding to the sample data a. For example, the sample data a may include 100 pieces of image data, and may include: image data 1, image data 2, …, image data 100. The training result corresponding to the image data 1 is a training result 1, the training result corresponding to the image data 2 is a training result 2, and so on.
It should be understood that the training results corresponding to the sample data a output by the model 300 may be output in the form of a log file (a data output form). At this time, a distributed storage system (Kafka Cluster, a system for data storage) and a data statistics tool (Logstash, a tool for data statistics) as shown in fig. 4 may statistically analyze the training results output from the model 300, thereby generating the training result success rate 3 of the model 300.
It will be appreciated that the computer device may match the training results corresponding to the sample data a output by the model 300 with the sample labels corresponding to the sample data a. For example, if the sample label corresponding to the image data 1 is a person, and the training result obtained for the image data 1 through the model 300 is also a person, the computer device may determine that the training result of the image data 1 is successfully matched with the sample label of the image data 1, and may then increment the count of successful matches. In addition, if the sample label corresponding to the image data 2 is a person, while the training result obtained for the image data 2 through the model 300 is a plant, the computer device may determine that the training result of the image data 2 fails to match the sample label, and may then increment the count of failed matches.
It should be appreciated that the computer device may determine the training result success rate of the model 300 (i.e., training result success rate 3) based on the ratio of the number of successful matches to the total number of sample data (i.e., the sum of the numbers of successful and failed matches). For example, the computer device may input each of the 100 image data in the sample data a into the model 300, so that the training result corresponding to each of the 100 image data may be obtained. Further, the computer device performs statistical analysis on the label information of each image data and the corresponding training results based on the distributed storage system and the data statistics tool; if the number of successful matches is determined to be 98, the computer device can determine that the training result success rate 3 of the model 300 is 98%.
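For ease of understanding, the matching and counting described above can be condensed into a short sketch (Python; a minimal in-memory stand-in for the Kafka/Logstash statistics pipeline, with illustrative names):

    def training_result_success_rate(training_results, sample_labels):
        # A match succeeds when the training result equals the sample label.
        matched = sum(1 for result, label in zip(training_results, sample_labels)
                      if result == label)
        # Success rate: successful matches over the total number of sample
        # data (the sum of successful and failed matches).
        return matched / len(sample_labels)

    # Example: 98 of 100 image training results match their labels -> 0.98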
At this point, the computer device may store the training result success rate 3 of the model 300 in the statistical database, where the training result success rate 3 has the latest stored timestamp. Further, when the computer device needs to read the training result success rate of the model 300, it may send a service request for querying that success rate to the statistical database through the search client shown in fig. 5. The statistical database may then obtain the training result success rate 3 with the latest timestamp based on the service request and return it to the search client. At this time, the training result success rate obtained by the computer device for the model 300 is training result success rate 3.
And S103, if the training result success rate is greater than a success rate threshold, predicting the model running performance corresponding to the second model according to the original model running performance and the model topology of the second model.
Specifically, if the training result success rate is greater than the success rate threshold, the computer device may input the model topology of the second model and the model parameters of the second model into a prediction model associated with the original model. The original model is a model that is associated with the first model and has not undergone pruning processing. At this time, the computer device may predict the performance ratio corresponding to the second model through the prediction model. Further, the computer device can perform a performance test on the original model to obtain the original model running performance corresponding to the original model. The computer device may then determine the model running performance corresponding to the second model based on the original model running performance and the performance ratio.
It should be appreciated that, as shown in fig. 4, the computer device may input the model C into the evaluation layer to evaluate the training result success rate of the model C. If the training result success rate of the model C (the second model) is greater than the success rate threshold (e.g., 95%) in the pruning decision condition, the computer device may store the model C in a local set of successful pruning models. Further, the computer device may determine the model C as the successful pruning model with the latest stored timestamp in the set of successful pruning models. For example, if the training result success rate of the model C is 98%, the computer device may store the model C in the local successful pruning model set and determine it as the successful pruning model with the latest stored timestamp in that set.
Furthermore, the computer device may evaluate the model running performance of the model C in the evaluation layer. The model running performance may be a queries-per-second rate (i.e., QPS), which represents the number of service queries the model can perform in one second. For ease of understanding, further, please refer to fig. 6, which is a schematic diagram of a scenario for determining the model running performance corresponding to the second model according to an embodiment of the present application. The original model in the embodiment of the present application may be the model a in the embodiment corresponding to fig. 4, and the second model may be the model C in the embodiment corresponding to fig. 4.
It will be appreciated that the computer device may obtain the original model running performance corresponding to the original model. To reduce measurement errors caused by temperature, occupancy rate, and other factors, the computer device can perform multiple performance tests on the original model, so that the original model running performance corresponding to the original model can be obtained more accurately. It should be appreciated that the computer device may perform performance tests on the original model to obtain at least two initial running performances corresponding to the original model. Further, the computer device may perform mean value processing on the at least two initial running performances to obtain the corresponding mean value, and may use this mean value as the original model running performance corresponding to the original model.
For example, the computer device may perform performance tests on the model a so that at least two (e.g., 5) initial running performances may be obtained. For example, the initial running performance 1 corresponding to the model a may be 765, the initial running performance 2 may be 762, the initial running performance 3 may be 775, the initial running performance 4 may be 783, and the initial running performance 5 may be 765. Further, the computer device may perform mean value processing on the 5 initial running performances to obtain the corresponding mean value (i.e., 770). At this time, the computer device may take this mean value as the model running performance corresponding to the model a (i.e., the original model running performance). It will be appreciated that when the computer device performs mean value processing on at least two initial running performances, the resulting mean value may be a non-integer. At this time, the computer device may perform approximation processing (e.g., rounding) on the non-integer mean value, and may use the approximated mean value as the original model running performance corresponding to the original model.
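A minimal sketch of this averaging step, assuming a hypothetical measure_qps callable that runs one performance test and returns a QPS value:

    import statistics

    def original_model_running_performance(measure_qps, num_trials=5):
        # Repeat the measurement to smooth out noise caused by
        # temperature, occupancy rate, and similar transient factors.
        samples = [measure_qps() for _ in range(num_trials)]
        # Average and round to an integer QPS value.
        return round(statistics.mean(samples))

    # Example: measurements [765, 762, 775, 783, 765] average to 770.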
Further, the computer device may obtain a model topology of a second model (e.g., model C) and model parameters of the model C. Further, the computer device may input the model topology of the model C and the model parameters of the model C to a prediction model associated with an original model (e.g., model a), and may further predict a performance ratio corresponding to the model C through the prediction model.
Wherein the prediction model may be used to predict a performance ratio between a model operating performance of the second model and an original model operating performance corresponding to the original model. The model topology of the sample model in the predictive model is the same as the model topology of the original model. It is understood that the original model may be a model associated with the first model and not subjected to pruning, and the original model in the embodiment of the present application may be a model a (i.e., the first model) as shown in fig. 4.
For example, the model topology of the model C may include a first sub-topology and a second sub-topology. The first sub-topology may be a convolutional neural network, cnn [256, 256, 128, 128]; the second sub-topology may be a deep neural network, dnn [256, 256, 128, 128]. At this time, the computer device may determine 8 model parameters according to the model topology of the model C, input these 8 model parameters into the prediction model associated with the original model (e.g., model a), and thereby predict the performance ratio corresponding to the model C through the prediction model. It is understood that the performance ratio corresponding to the model C refers to the ratio between the model running performance corresponding to the model C and the original model running performance corresponding to the original model (e.g., model a).
Further, the computer device determines the model running performance corresponding to the model C based on the performance ratio corresponding to the model C and the original model running performance corresponding to the original model (model a). Specifically, the computer device determines the model running performance corresponding to the second model as shown in the following formula (1):

per_i = perf_ratio * per_ori, (1)

where per_i is the model running performance corresponding to the second model, perf_ratio is the performance ratio corresponding to the second model, and per_ori is the original model running performance corresponding to the original model.
For example, the performance ratio corresponding to the model C predicted by the computer device through the prediction model may be 1.2, and the original model running performance corresponding to the original model may be 770; the model running performance corresponding to the model C determined by the above formula (1) is then 924. It will be appreciated that the model C may perform 924 service queries in one second. In other words, if the model C is a model for image recognition, the model C can recognize 924 images in one second and obtain the recognition results corresponding to each of the 924 images.
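The computation of formula (1), shown as a one-line sketch with the figures from this example:

    def model_running_performance(perf_ratio, per_ori):
        # Formula (1): per_i = perf_ratio * per_ori
        return perf_ratio * per_ori

    print(model_running_performance(1.2, 770))  # 924.0 queries per second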
Alternatively, as shown in fig. 4, if the training result success rate of model C (second model) is less than or equal to the success rate threshold (e.g., 95%) in the pruning decision condition, the computer device may obtain the successful pruning model with the latest timestamp from the locally stored set of successful pruning models. Further, the computer device may update the pruning model based on the successful pruning model with the latest stored timestamp, i.e. may retrain the successful pruning model with the latest stored timestamp, and determine the successfully pruned model with the latest stored timestamp after the training is completed as the second model.
For example, in an embodiment corresponding to fig. 5, if the training result success rate of the model 300 determined by the computer device is 93%, the computer device may obtain a successful pruning model (e.g., the model 200) with the latest stored timestamp from among the locally stored successful pruning models. Further, the computer device may update the model 200 to the pruning model, retrain the model 200 to obtain a retrained model 200, and may determine the retrained model 200 as the second model.
And S104, if the running performance of the model corresponding to the second model does not reach the performance threshold, pruning the second model again.
Specifically, if the model running performance corresponding to the second model does not reach the performance threshold in the pruning judgment condition, the computer device may perform pruning processing on the second model. In other words, the second model whose model running performance does not reach the performance threshold may be updated to the first model, and pruning processing may be performed on the updated first model, so as to obtain a pruned model after pruning processing.
It should be understood that, as shown in fig. 4, if the model running performance corresponding to the model C obtained by the computer device through the above prediction model is 840, the computer device may determine that the model running performance corresponding to the model C does not reach the performance threshold in the pruning decision condition (for example, 850). At this time, the computer device may determine the model C as a first model (for example, model A1). Further, the computer device may perform pruning processing on the model A1 again to obtain a pruned model (for example, model B1). Then, the computer device may train the model B1 multiple times and determine the trained model B1 as a second model (for example, model C1).
And S105, if the model running performance corresponding to the second model reaches the performance threshold, determining the second model as a target model for business processing.
Specifically, if the model running performance corresponding to the second model obtained by the computer device through the prediction model reaches the performance threshold in the pruning decision condition, the computer device may determine the second model as a target model for performing service processing. The target model may be applied to any one of the user terminals in the user terminal cluster corresponding to fig. 1, for example, the user terminal 3000a.
It should be appreciated that, as shown in fig. 4, if the model running performance corresponding to the model C predicted by the computer device through the above prediction model is 900, the computer device may determine that the model running performance corresponding to the model C reaches the performance threshold (for example, 850) in the pruning determination condition. At this time, the computer device may determine the model C as a target model for performing business processing, and export the target model out of the computer device. It will be appreciated that the computer device may send the model C to any one of the user terminals in the cluster of user terminals as shown in fig. 1, according to a network connection relationship.
In the embodiment of the application, the computer device can perform pruning processing and training on the first model to obtain a second model, and can then obtain the training result success rate of the second model. Furthermore, the computer device does not need to test the model running performance of the second model manually: it can predict the performance ratio corresponding to the second model through the prediction model, and then quickly obtain the model running performance of the second model from the performance ratio and the original model running performance, so the pruning process consumes little time. The computer device can then automatically prune the first model according to the success rate threshold and the performance threshold in the pruning decision condition, so as to quickly find a balance point between recognition accuracy and running performance, thereby improving the pruning efficiency of the model.
Further, please refer to fig. 7, which is a flowchart illustrating a method for determining a prediction model according to an embodiment of the present application. As shown in fig. 7, the method may include:
S201, obtaining model parameters of a sample model.
In particular, the computer device may obtain model parameters of the sample model. It will be appreciated that the model parameters of the sample model have the same model topology as the model parameters of the original model described above. The original model may be the original model in the embodiment corresponding to fig. 3, and the original model may be a model associated with the first model and not subjected to pruning.
The computer device in the embodiment of the present application may be an entity terminal with a model optimization (for example, model pruning) function, where the entity terminal may be a server or a user terminal. The embodiment of the application takes the server 2000 shown in fig. 1 as an example to illustrate the process of model optimization of the neural network model by the computer device. It is understood that the neural network model may be a model for performing image recognition, a model for performing an artificial intelligence game, a model for performing voice recognition, or the like. The embodiment of the application does not limit the business processing which can be performed by the neural network model.
For ease of understanding, further, please refer to fig. 8, which is a schematic diagram of a scenario for recording actual performance ratios corresponding to a sample training model according to an embodiment of the present application. The sample model 10 in the embodiment of the present application may be associated with the original model (for example, the model a) in the embodiment corresponding to fig. 4.
As shown in fig. 8, in order to more accurately predict the model running performance corresponding to the second model associated with the above original model, the computer device may acquire the model topology of the original model (i.e., model a) in the embodiment corresponding to fig. 4. The model topology may include a model parameter a, a model parameter b, and a model parameter c. Further, the computer device may create a set of neural network models having the same model topology based on the model topology of model a. In this embodiment of the present application, a neural network model created by the computer device according to the model topology of the original model may be referred to as a sample model. At this point, the computer device may obtain the model parameters of the sample model 10 in order to perform pruning on the sample model.
S202, randomly deleting channel parameters in the model parameters of the sample model to obtain a sample pruning model corresponding to the sample model.
Specifically, the computer device may randomly delete the channel parameters in the model topology of the sample model according to the model topology of the sample model, so as to obtain a sample pruning model corresponding to the sample model.
It should be appreciated that, as shown in fig. 8, to obtain more generalized sample pruning models 20, the computer device may randomly delete channel parameters (i.e., dimensions of the weight matrices corresponding to the sample model) among the model parameters in the model topology of the sample model 10. This random deletion from the model topology of the sample model 10 simulates the process of pruning the original model, as sketched below. It will be appreciated that the model parameters of the sample model 10 may be randomly initialized during the simulated pruning process to obtain a large number of sample pruning models 20, so that a prediction model capable of predicting the model running performance more accurately can be obtained.
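A minimal sketch of the random channel deletion, assuming the topology is summarized by per-layer channel counts (the function name and bounds are illustrative):

    import random

    def random_channel_prune(layer_widths, min_channels=1):
        # Randomly shrink each layer's channel count to simulate pruning;
        # layer_widths might be [256, 256, 128, 128] for one sub-topology.
        return [random.randint(min_channels, width) for width in layer_widths]

    # Repeated calls, each followed by random parameter initialization,
    # yield a large number of distinct sample pruning models.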
And S203, training the sample pruning model according to the second sample data to obtain a sample training model.
Specifically, the computer device may train the sample pruning model multiple times according to the second sample data, and determine the sample pruning model after training as the sample training model.
It should be appreciated that, as shown in fig. 8, the computer device may refer to the sample data used to train the sample pruning model 20 as second sample data. The original model may be a neural network model for performing image recognition processing; therefore, the second sample data may be a plurality of image data (for example, 100 pieces of image data) used to train the sample pruning model 20, so that the sample training model 30 may be obtained after training is completed. For a specific implementation manner of training the sample pruning model 20, reference may be made to the description of training the pruning model to obtain the second model in step S102 of the embodiment corresponding to fig. 3.
S204, acquiring an actual performance ratio between the sample training model and the sample model.
Specifically, the computer device may perform multiple performance tests on the sample training model to obtain at least two initial model running performances corresponding to the sample training model. Further, the computer device may perform mean value processing on at least two initial model running performances corresponding to the sample training model, to obtain an average value of at least two initial model running performances corresponding to the sample training model, and use the average value as the model running performance corresponding to the sample training model. The computer device may then determine an actual performance ratio between the sample training model and the sample model based on the model performance of the sample model (i.e., the original model performance corresponding to the original model) and the model performance corresponding to the sample training model.
It should be appreciated that, as shown in fig. 8, the computer device may perform multiple performance tests on the sample training model 30 so that at least two (e.g., 5) initial model running performances of the sample training model 30 may be obtained. Further, the computer device may perform mean value processing on the 5 initial model running performances corresponding to the sample training model 30 to obtain the corresponding mean value (e.g., 780), and use this mean value as the model running performance corresponding to the sample training model 30. The computer device may then determine that the actual performance ratio between the sample training model 30 and the sample model 10 (i.e., model a) is 1.013 (i.e., 780/770), based on the model running performance of the sample model 10 (the original model running performance, e.g., 770) and the model running performance corresponding to the sample training model 30. Further, the computer device may record the mapping relationship between the actual performance ratio and the model parameters of the sample training model in a record table as shown in fig. 8.
It will be appreciated that the computer device may obtain a corresponding plurality of sample pruning models 20 by modifying the model parameters in the model topology of the sample model 10 multiple times, and may obtain a corresponding plurality of sample training models 30 by training these sample pruning models 20. Further, the computer device may record, in the record table, the mapping relationship between the model parameters in the model topologies of the plurality of sample training models 30 and their actual performance ratios, to serve as training sample data for an initial prediction model, so that a prediction model for predicting the performance ratio corresponding to the second model may be obtained.
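The record-table bookkeeping can be sketched as follows (a plain Python list standing in for the record table of fig. 8; names are illustrative):

    def record_actual_performance_ratio(record_table, model_parameters,
                                        sample_training_qps, original_qps):
        # Actual performance ratio between the sample training model and
        # the sample model (whose performance matches the original model).
        actual_ratio = sample_training_qps / original_qps
        # Map the pruned model parameters to the measured ratio; these
        # pairs later serve as training data for the initial prediction model.
        record_table.append((tuple(model_parameters), round(actual_ratio, 3)))

    table = []
    record_actual_performance_ratio(table, [256, 256, 128, 128], 780, 770)
    print(table)  # [((256, 256, 128, 128), 1.013)]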
S205, predicting the prediction performance ratio corresponding to the sample training model through an initial prediction model.
Specifically, the computer device may input the model topology of the sample training model and model parameters of the sample training model into an initial prediction model by which a prediction performance ratio corresponding to the sample training model is predicted.
It should be appreciated that the forward propagation by which the initial prediction model determines the prediction performance ratio of the input sample training model may be as shown in the following formula (2):

x_i = f(w_i * x_{i-1}), y = x_n, (2)

where y is the prediction performance ratio of the sample training model predicted by the initial prediction model, x_{i-1} represents the input of the i-th fully connected layer in forward propagation (f denoting the layer activation), x_0 represents the input value of the initial prediction model, and w_i represents the model parameters of the i-th layer of the initial prediction model.
It should be understood that, in the embodiments of the present application, an initial prediction model may be constructed from multiple fully connected layers to obtain the prediction model. The prediction model is, in effect, a nonlinear regression model between the model parameters in the model topology of the sample training model input to the initial prediction model and the performance ratio corresponding to that sample training model.
It will be appreciated that the computer device may input the model topology of the sample training model 30 and the model parameters of the sample training model 30 as shown in fig. 8 into the initial prediction model, from which a prediction performance ratio corresponding to the sample training model 30 may be predicted.
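A sketch of such a forward pass (NumPy; the tanh activation, layer sizes, and random weights are assumptions for illustration, since the text only specifies fully connected layers):

    import numpy as np

    def predict_performance_ratio(x0, weights, activation=np.tanh):
        # Layer-wise forward propagation as in formula (2): each fully
        # connected layer computes x_i = f(w_i . x_{i-1}); the last layer
        # maps to the scalar prediction performance ratio y.
        x = np.asarray(x0, dtype=float)
        for w in weights[:-1]:
            x = activation(w @ x)
        return (weights[-1] @ x).item()

    # x0 holds the 8 channel counts of the pruned topology, e.g.
    # cnn[256, 256, 128, 128] + dnn[256, 256, 128, 128].
    rng = np.random.default_rng(0)
    weights = [rng.normal(size=(16, 8)),
               rng.normal(size=(16, 16)),
               rng.normal(size=(1, 16))]
    ratio = predict_performance_ratio(
        [256, 256, 128, 128, 256, 256, 128, 128], weights)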
S206, adjusting the initial prediction model according to the actual performance ratio and the predicted performance ratio.
Specifically, the computer device may adjust the model parameters of the initial prediction model (i.e., w_i in formula (2)) based on the actual performance ratio and the predicted performance ratio of the sample training model, to obtain an adjusted initial prediction model.

It will be appreciated that the computer device may adjust the model parameters of the initial prediction model (i.e., w_i in formula (2) above) based on the actual performance ratio and the predicted performance ratio of the sample training model 30 shown in fig. 8, to obtain an adjusted initial prediction model.
And S207, when the adjusted initial prediction model meets the convergence condition, determining the adjusted initial prediction model as the prediction model associated with the original model.
Specifically, the computer device determines a loss value of the adjusted initial prediction model based on the actual performance ratio, the predicted performance ratio output by the adjusted initial prediction model, and the regularization loss value of the adjusted initial prediction model. When the loss value is less than the loss function threshold, the computer device may obtain the prediction result success rate of the adjusted initial prediction model. When the prediction result success rate of the adjusted initial prediction model is greater than the prediction success rate threshold, the computer device may determine the adjusted initial prediction model as the prediction model associated with the original model in the embodiment corresponding to fig. 3.
It should be appreciated that, since the initial prediction model is a nonlinear regression model, the most common loss function is a sum-of-squares form. Specifically, the loss function of the initial prediction model may be represented by the following formula (3):

Loss = Σ_i (y_i − ȳ_i)² + loss_r, (3)

where the loss function of formula (3) consists of two parts. In the first part, y_i represents the prediction performance ratio of the i-th sample training model predicted by the adjusted initial prediction model, and ȳ_i represents the actual performance ratio corresponding to the i-th sample training model. In the second part, loss_r is the regularization loss value of the adjusted initial prediction model; the regularization loss value can effectively prevent over-fitting of the parameters in the adjusted initial prediction model.
Wherein it is understood that the computer device may input the sample training model into the adjusted initial prediction model, so as to output an adjusted prediction performance ratio corresponding to the sample training model. The computer device may then obtain regularized loss values of the adjusted initial predictive model. Further, the computer device may determine the loss value of the adjusted initial prediction model according to the actual performance ratio of the sample training model, the adjusted predicted performance ratio of the sample training model, and the regularized loss value of the adjusted initial prediction model through the above formula (3).
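A sketch of the loss computation of formula (3); the L2 form of the regularization term is an assumption, since the text only names a regularization loss value:

    import numpy as np

    def prediction_loss(predicted_ratios, actual_ratios, weights,
                        reg_lambda=1e-4):
        # First part of formula (3): sum of squared errors between the
        # predicted and the actual performance ratios.
        y = np.asarray(predicted_ratios)
        y_bar = np.asarray(actual_ratios)
        squared_error = np.sum((y - y_bar) ** 2)
        # Second part: a regularization loss value that discourages
        # over-fitting of the prediction model's parameters.
        loss_r = reg_lambda * sum(np.sum(w ** 2) for w in weights)
        return squared_error + loss_r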
When the loss value of the adjusted initial prediction model is greater than or equal to the loss function threshold, the computer device may continue to adjust the model parameters of the adjusted initial prediction model (i.e., w_i in formula (2) above) to obtain a prediction model capable of accurately predicting the model running performance.
When the loss value of the adjusted initial prediction model is less than the loss function threshold, the computer device may obtain the prediction result success rate of the adjusted initial prediction model; when the prediction result success rate of the adjusted initial prediction model is greater than the prediction success rate threshold, the computer device may determine the adjusted initial prediction model as the prediction model associated with the original model.
It should be appreciated that the computer device may determine the prediction result success rate of the adjusted initial prediction model by determining the prediction error of the adjusted initial prediction model. It can be understood that the computer device may perform pruning processing on the sample model, that is, randomly delete model parameters in the model topology structure of the sample model, so as to obtain a sample test model. Further, the computer device may test the adjusted initial prediction model based on the sample test model, so as to obtain a sample test result corresponding to the sample test model. Further, the computer device may determine a prediction error of the adjusted initial prediction model based on the sample test results.
The computer device may obtain the test performance ratio in the sample test result, and determine the test model running performance corresponding to the sample test model according to the test performance ratio in the sample test result and the original model running performance through the above formula (1). Further, the computer device may perform a performance test on the sample test model to obtain the actual model running performance corresponding to the sample test model. Further, the computer device may determine the prediction error corresponding to the adjusted initial prediction model according to the test model running performance, the actual model running performance corresponding to the sample test model, and the original model running performance. Specifically, the computer device determines the prediction error as shown in the following formula (4):

dev = |per_infe − per_act| / per_ori, (4)

where per_infe represents the test model running performance of the sample test model, per_act represents the actual model running performance of the sample test model, per_ori represents the original model running performance, and dev represents the prediction error corresponding to the sample test model.
Further, the computer device may determine a sample test result whose prediction error is greater than or equal to an error threshold (e.g., 0.05) as a failed sample test result, and count the number of failed sample test results. For example, the test model running performance determined for the sample test model A through the adjusted initial prediction model is 780, while the performance test performed by the computer device on the sample test model A yields an actual model running performance of 850, the original model running performance being 770. In this case, the prediction error determined according to the above formula (4) is 0.09; in other words, the prediction error corresponding to the sample test model A is greater than the error threshold, and the computer device may consider the sample test result corresponding to the sample test model A to be a failed sample test result.
Further, the computer device may determine a sample test result whose prediction error is less than the error threshold (e.g., 0.05) as a successful sample test result, and count the number of successful sample test results. For example, the test model running performance determined for the sample test model B through the adjusted initial prediction model is 880, while the performance test performed by the computer device on the sample test model B yields an actual model running performance of 860, the original model running performance being 770. In this case, the prediction error determined according to the above formula (4) is 0.026; in other words, the prediction error corresponding to the sample test model B is less than the error threshold, and the computer device may consider the sample test result corresponding to the sample test model B to be a successful sample test result.
Further, the computer device may determine the ratio of the number of successful sample test results to the total number of sample test results (i.e., the sum of the numbers of successful and failed sample test results), and determine this ratio as the prediction result success rate of the adjusted initial prediction model.
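Formula (4) and the counting in this step, condensed into a sketch (the triple layout of the test results is an illustrative assumption):

    def prediction_error(per_infe, per_act, per_ori):
        # Formula (4): dev = |per_infe - per_act| / per_ori
        return abs(per_infe - per_act) / per_ori

    def prediction_result_success_rate(test_results, error_threshold=0.05):
        # test_results holds (per_infe, per_act, per_ori) triples; a sample
        # test result succeeds when its prediction error is below threshold.
        successes = sum(1 for result in test_results
                        if prediction_error(*result) < error_threshold)
        return successes / len(test_results)

    # The two examples above: (780, 850, 770) -> dev 0.09, a failure;
    # (880, 860, 770) -> dev 0.026, a success.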
If the success rate of the prediction result of the adjusted initial prediction model is less than or equal to the threshold value of the prediction success rate, the computer equipment can perform pruning treatment on the sample model again to generate a sample pruning model so as to reconstruct the initial prediction model.
If the success rate of the prediction result of the adjusted initial prediction model is greater than the threshold of the success rate of prediction, the computer device may determine that the adjusted initial prediction model satisfies a convergence condition, and determine the adjusted initial prediction model satisfying the convergence condition as a prediction model associated with the original model.
It can be appreciated that, in the embodiment of the application, performance tests may be performed on multiple sets of test data (for example, 15 sets of test models with different pruning degrees) both through the prediction model and through manual performance testing, so that the measured differences between the two performance test modes can be compared intuitively. The test models may be obtained by pruning and training the original model.
It should be understood that the computer device may perform the manual performance test on the 15 sets of test models, so that the model running performance corresponding to the 15 sets of test models may be obtained, and further, the computer device may obtain the actual performance ratio corresponding to the 15 sets of test models through the above formula (1) according to the original model running performance corresponding to the original model.
Further, the computer device may input the 15 sets of test models into the above prediction model, respectively, so that the predicted performance ratios corresponding to the 15 sets of test models may be obtained directly. The computer device can record the actual performance ratio and the predicted performance ratio corresponding to each of the 15 sets of test models, so that an effect comparison graph of the predicted and actual performance ratios obtained through the two performance test modes can be produced, and the difference between the two performance test modes can be seen intuitively.
For example, the predicted performance ratio obtained for the test model A through the prediction model may be 1.47, and the actual performance ratio obtained for the test model A through the manual performance test may be 1.50. It can be appreciated that if the original model running performance corresponding to the test model A is 770, the actual model running performance of the test model A can be 1150, and the predicted model running performance corresponding to the test model A can be 1131. In other words, the actual time taken by the test model A to identify an image is 0.86 ms, while the predicted time is 0.88 ms. Thus, the predicted performance ratio obtained through the prediction model and the actual performance ratio obtained through the manual performance test differ little for the test model A, but a manual performance test takes a long time; replacing the manual performance test with the prediction model in the embodiment of the application can therefore improve the efficiency of evaluating model running performance.
In the embodiment of the application, in order to construct a prediction model that more accurately predicts the performance ratio corresponding to the second model, the computer device may acquire the model topology of the original model associated with the second model to obtain training sample data for an initial prediction model, and further construct a prediction model for predicting the model performance ratio, so that the model running performance corresponding to the second model may be determined rapidly.
Further, please refer to fig. 9, which is a flowchart of a method for automatic pruning according to an embodiment of the present application. As shown in fig. 9, the method can be applied to a computer device having a model optimization function. As shown in fig. 9, the method may include:
s301, importing the neural network model into computer equipment, and training the neural network model to obtain an original model.
S302, determining an original model as a first model, and pruning the first model to obtain a pruning model.
And S303, training the pruning model to obtain a second model.
S304, obtaining the success rate of the training result of the second model, and matching with a success rate threshold in pruning judgment conditions.
And S305, if the success rate of the training result does not reach the success rate threshold, acquiring a successful pruning model with the latest storage time stamp from the locally stored successful pruning model set, so that the computer equipment trains the successful pruning model with the latest storage time stamp again.
And S306, if the success rate of the training result reaches the success rate threshold, storing the second model into a successful pruning model set, and updating the successful pruning model with the latest storage time stamp.
S307, predicting the model running performance corresponding to the second model and matching it against the performance threshold in the pruning decision condition; if the model running performance does not reach the performance threshold, pruning the second model again.
And S308, when the model running performance reaches the performance threshold, exporting the second model from the computer device as the target model (a condensed sketch of this loop follows below).
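The loop of steps S301-S308 can be summarized in a condensed sketch; prune, train, success_rate, and predict_qps are hypothetical stand-ins for the components described in this embodiment, and the thresholds are the example values used above:

    def automatic_pruning(original_model, prune, train, success_rate,
                          predict_qps, success_threshold=0.95,
                          performance_threshold=850):
        first_model = original_model
        successful_models = []  # locally stored successful pruning models
        while True:
            second_model = train(prune(first_model))            # S302-S303
            if success_rate(second_model) > success_threshold:  # S304, S306
                successful_models.append(second_model)          # latest timestamp
            else:                                               # S305
                # Retrain the successful pruning model with the latest stored
                # timestamp (assumes at least one has been stored already).
                second_model = train(successful_models[-1])
            if predict_qps(second_model) >= performance_threshold:  # S308
                return second_model  # target model, exported for business use
            first_model = second_model                          # S307: prune again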
The computer device in the embodiment of the present application may be an entity terminal with a model optimization (for example, model pruning) function, where the entity terminal may be a server or a user terminal. The embodiment of the application takes the server 2000 shown in fig. 1 as an example to illustrate the process of model optimization of the neural network model by the computer device. It is understood that the neural network model may be a model for performing image recognition, a model for performing an artificial intelligence game, a model for performing voice recognition, or the like. The embodiment of the application does not limit the business processing which can be performed by the neural network model.
In the embodiment of the present application, the specific implementation of the process of automatic pruning may refer to the description of step S101-step S105 in the embodiment corresponding to fig. 3 and step S201-step S207 in the embodiment corresponding to fig. 7, and will not be further described herein.
Further, please refer to fig. 10, which is a schematic diagram of data interaction using the object model according to an embodiment of the present application. The computer device in the embodiment of the present application may be an entity terminal with a model optimization function, and the entity terminal may be the server 2000 shown in fig. 1. The original model in the embodiment of the application can be a model for playing an artificial intelligence game.
The computer equipment can take the original model as a first model, and further can perform pruning processing on the model, so that a pruning model can be obtained. Further, the computer device may train the pruning model so that a second model may be obtained. At this time, the computer device may automatically perform pruning processing on the second model based on the success rate threshold and the performance threshold in the pruning determination condition, so that a target model output to the computer device may be obtained. It should be understood that, for a specific implementation manner of the automatic pruning process of the original model by the computer device, reference may be made to the description of step S101 to step S105 in the embodiment corresponding to fig. 3, and the detailed description will not be repeated here.
It should be understood that the computer device may send the object model to a user terminal a having a network connection relationship with the computer device. The user terminal a may be any one of the user terminals in the user terminal cluster shown in fig. 1, for example, the user terminal 3000a. It will be appreciated that the object model may be used in the object application of the user terminal a. The target application can adopt the target model to conduct business processing of the artificial intelligent game.
It will be appreciated that the target model may be applied in a scenario where a game user of game scenario a plays against a robot. As shown in fig. 10, the display interface 100 of the user terminal a may include the playing cards held by the game user corresponding to the user terminal a and the playing cards held by the robot playing against the game user. The robot can invoke the target model according to the game user's play, so that the corresponding card to play can be calculated quickly.
For example, when the game user plays a "3" card, the robot may input the "3" into the target model, and the target model may calculate a card to play in response among the playing cards it holds, for example, a "5". Further, the target model may output the "5" to the display interface 100; in other words, the robot may discard a "5".

Alternatively, when the game user plays two "6" cards, the robot may input the two "6" cards into the target model, and the target model may calculate the cards to play in response among the playing cards it holds, for example, two "8" cards. Further, the target model may output the two "8"s to the display interface 100; in other words, the robot may discard two "8"s.
It can be understood that the target model obtained by the computer device results from model optimization of the original model; compared with the original model, it requires less storage space and memory bandwidth, so the target model can be more easily applied to the terminal device and run locally without performing game AI processing through a background server. It should be understood that, since the terminal device can perform game AI processing through the target model without network communication, the embodiment of the application can reduce game lag caused by poor network conditions, and can thus improve the speed of game operation.
In the embodiment of the application, the computer device can perform pruning processing and training on the first model to obtain a second model, and can then obtain the training result success rate of the second model. Furthermore, the computer device does not need to test the model running performance of the second model manually: it can predict the performance ratio corresponding to the second model through the prediction model, and then quickly obtain the model running performance of the second model from the performance ratio and the original model running performance, so the pruning process consumes little time. The computer device can then automatically prune the first model according to the success rate threshold and the performance threshold in the pruning decision condition, so as to quickly find a balance point between recognition accuracy and running performance, thereby improving the pruning efficiency of the model.
Further, please refer to fig. 11, which is a schematic diagram illustrating a structure of a data processing apparatus according to an embodiment of the present application. The data processing means may be a computer program (comprising program code) running in a computer device, for example, the data processing means is an application software; the data processing device may be used to perform the corresponding steps in the methods provided by the embodiments of the present application. As shown in fig. 11, the data processing apparatus 1 may include: pruning module 10, first training module 11, first prediction module 12, re-pruning module 13, first determination module 14, first acquisition module 15, updating module 16, second determination module 17, second acquisition module 18, deletion module 19, second training module 20, third acquisition module 21, second prediction module 22, adjustment module 23, third determination module 24, fourth determination module 25, fourth acquisition module 26, fifth determination module 27, and regeneration module 28.
The pruning module 10 is configured to perform pruning processing on the first model to obtain a pruned model.
Wherein, this pruning module 10 includes: a first acquisition unit 101, a first determination unit 102, and a deletion unit 103.
The first obtaining unit 101 is configured to obtain a result influence corresponding to the channel parameter in the first model;
The first determining unit 102 is configured to determine a pruning channel in the channel parameters based on the effect of the result;
the deleting unit 103 is configured to delete the pruning channel in the first model to obtain a pruning model.
The specific implementation manner of the first obtaining unit 101, the first determining unit 102, and the deleting unit 103 may refer to the description of step S101 in the embodiment corresponding to fig. 3, and the detailed description will not be repeated here.
The first training module 11 is configured to train the pruning model to obtain a second model, and obtain a training result success rate of the second model.
Wherein the first training module 11 comprises: training unit 111, storage unit 112 and reading unit 113.
The training unit 111 is configured to train the pruning model according to the first sample data to obtain a second model.
The storage unit 112 is configured to store a training result corresponding to the first sample data output by the second model in a statistics database; the statistical database is used for storing the training result success rate of the second model; the success rate of the training result is generated according to the sample label corresponding to the first sample data and the training result statistics;
The reading unit 113 is configured to read the training result success rate of the second model from the statistics database.
The specific implementation manner of the training unit 111, the storage unit 112, and the reading unit 113 may refer to the description of step S102 in the embodiment corresponding to fig. 3, and the detailed description will not be repeated here.
The first prediction module 12 is configured to predict, according to the original model running performance and the model topology structure of the second model, the model running performance corresponding to the second model if the success rate of the training result is greater than a success rate threshold.
Wherein the first prediction module 12 comprises: an input unit 121, a prediction unit 122, a second acquisition unit 123, and a second determination unit 124.
The input unit 121 is configured to input the model topology of the second model and the model parameters of the second model to a prediction model associated with the original model if the success rate of the training result is greater than a success rate threshold; the original model is a model which is associated with the first model and is not subjected to pruning treatment;
a prediction unit 122 configured to predict a performance ratio corresponding to the second model by the prediction model;
The second obtaining unit 123 is configured to perform a performance test on the original model, and obtain the running performance of the original model corresponding to the original model.
Wherein the second acquisition unit 123 includes: a first acquisition subunit 1231, a mean processing subunit 1232, and a first determination subunit 1233.
The first obtaining subunit 1231 is configured to perform a performance test on the original model, and obtain at least two initial running performances corresponding to the original model;
the average value processing subunit 1232 is configured to perform average value processing on the at least two initial running performances to obtain an average value corresponding to the at least two initial running performances;
the first determining subunit 1233 is configured to determine the average value as the running performance of the original model.
The specific implementation manner of the first obtaining subunit 1231, the average processing subunit 1232, and the first determining subunit 1233 may refer to the description of obtaining the running performance of the original model in the embodiment corresponding to fig. 3, which will not be further described herein.
The second determining unit 124 is configured to determine a model running performance corresponding to the second model based on the original model performance and the performance ratio.
The specific implementation manner of the input unit 121, the prediction unit 122, the second obtaining unit 123, and the second determining unit 124 may be referred to the description of step S103 in the embodiment corresponding to fig. 3, and the detailed description will not be repeated here.
The re-pruning module 13 is configured to prune the second model again if the model running performance corresponding to the second model does not reach the performance threshold;
the first determining module 14 is configured to determine the second model as a target model for performing service processing if the model running performance corresponding to the second model reaches the performance threshold.
The first obtaining module 15 is configured to obtain, from a locally stored set of successful pruning models, a successful pruning model with a latest stored timestamp if the success rate of the training result is less than or equal to the success rate threshold; the successful pruning model refers to the second model with the success rate of the training result being greater than the success rate threshold;
the updating module 16 is configured to update the pruning model based on the successful pruning model with the latest stored time stamp.
The second determining module 17 is configured to store the second model with the training result success rate greater than the success rate threshold in the successful pruning model set, and determine the second model as a successful pruning model with the latest stored timestamp in the successful pruning model set.
The second obtaining module 18 is configured to obtain model parameters of the sample model; the model parameters of the sample model and the model parameters of the original model have the same model topological structure;
the deleting module 19 is configured to randomly delete a channel parameter in the model parameters of the sample model to obtain a sample pruning model corresponding to the sample model;
the second training module 20 is configured to train the sample pruning model according to second sample data to obtain a sample training model;
the third obtaining module 21 is configured to obtain an actual performance ratio between the sample training model and the sample model;
the second prediction module 22 is configured to predict, by using an initial prediction model, a prediction performance ratio corresponding to the sample training model.
Wherein the second prediction module 22 is further configured to:
and inputting the model topological structure of the sample training model and the model parameters of the sample training model into the initial prediction model, and predicting the prediction performance ratio corresponding to the sample training model through the initial prediction model.
The adjustment module 23 is configured to adjust the initial prediction model according to the actual performance ratio and the predicted performance ratio;
The third determining module 24 is configured to determine the adjusted initial prediction model as a prediction model associated with the original model when the adjusted initial prediction model meets a convergence condition.
The fourth determining module 25 is configured to determine a loss value of the adjusted initial prediction model according to the actual performance ratio, the predicted performance ratio output by the adjusted initial prediction model, and a regularized loss value of the adjusted initial prediction model;
the fourth obtaining module 26 is configured to obtain a success rate of the prediction result of the adjusted initial prediction model when the loss value is smaller than a loss function threshold.
Wherein the fourth acquisition module 26 comprises: a test unit 261, a third determination unit 262, a fourth determination unit 263 and a statistics unit 264.
The testing unit 261 is configured to test the adjusted initial prediction model based on a sample test model when the loss value is smaller than the loss function threshold, so as to obtain a sample test result corresponding to the sample test model; the sample test model is obtained by pruning the sample model;
the third determining unit 262 is configured to determine a prediction error corresponding to the adjusted initial prediction model based on the sample test result.
Wherein the third determining unit 262 includes: a second acquisition subunit 2621, a second determination subunit 2622, a third acquisition subunit 2623, and a third determination subunit 2624.
The second obtaining subunit 2621 is configured to obtain a test performance ratio in the sample test result;
the second determining subunit 2622 is configured to determine a test model operation performance corresponding to the sample test model according to a test performance ratio in the sample test result and the original model operation performance;
the third obtaining subunit 2623 is configured to obtain an actual model running performance corresponding to the sample test model;
the third determining subunit 2624 is configured to determine a prediction error corresponding to the adjusted initial prediction model according to the test model running performance, the actual model running performance of the sample test model, and the original model running performance.
The specific implementation manner of the second obtaining subunit 2621, the second determining subunit 2622, the third obtaining subunit 2623 and the third determining subunit 2624 may be referred to the above description of the prediction error in the embodiment corresponding to fig. 7, and will not be further described herein.
The fourth determining unit 263 is configured to determine a sample test result with the prediction error smaller than the error threshold as a successful sample test result;
the statistics unit 264 is configured to count the success rate of the prediction result of the adjusted initial prediction model according to the total number of the sample test results and the number of the successful sample test results.
The specific implementation manner of the test unit 261, the third determination unit 262, the fourth determination unit 263 and the statistics unit 264 may refer to the description of the success rate of obtaining the prediction result in the embodiment corresponding to fig. 7, and will not be further described herein.
The fifth determining module 27 is configured to determine that the adjusted initial prediction model meets a convergence condition if the success rate of the prediction result of the adjusted initial prediction model is greater than a prediction success rate threshold;
the regeneration module 28 is configured to regenerate the sample pruning model to reconstruct the initial prediction model if the prediction result success rate of the adjusted initial prediction model is less than or equal to the prediction success rate threshold.
The specific implementation manner of the pruning module 10, the first training module 11, the first prediction module 12, the pruning module 13, the first determination module 14, the first obtaining module 15, the updating module 16, the second determination module 17, the second obtaining module 18, the deleting module 19, the second training module 20, the third obtaining module 21, the second prediction module 22, the adjusting module 23, the third determination module 24, the fourth determination module 25, the fourth obtaining module 26, the fifth determination module 27 and the regeneration module 28 may refer to the description of steps S101-S105 in the embodiment corresponding to fig. 3 and steps S201-S207 in the embodiment corresponding to fig. 7, and will not be repeated here. In addition, the description of the beneficial effects of the same method is omitted.
Further, please refer to fig. 12, which is a schematic diagram of a computer device according to an embodiment of the present application. As shown in fig. 12, the computer device 1000 may be the server 2000 in the embodiment corresponding to fig. 1, and the computer device 1000 may include: at least one processor 1001 (e.g., a CPU), at least one network interface 1004, a user interface 1003, a memory 1005, and at least one communication bus 1002. The communication bus 1002 is used to enable communication connections between these components. The user interface 1003 may include a display and a keyboard, and the network interface 1004 may optionally include a standard wired interface or a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as at least one magnetic disk memory. The memory 1005 may optionally also be at least one storage device located remotely from the aforementioned processor 1001. As shown in fig. 12, the memory 1005, which is one type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the computer device 1000 shown in fig. 12, the network interface 1004 is mainly used for network communication with a user terminal; the user interface 1003 is mainly used to provide an input interface for the user; and the processor 1001 may be used to invoke the device control application program stored in the memory 1005 to implement:
pruning is carried out on the first model to obtain a pruning model;
training the pruning model to obtain a second model, and obtaining the success rate of training results of the second model;
if the success rate of the training result is greater than the success rate threshold, predicting the model running performance corresponding to the second model according to the original model running performance and the model topological structure of the second model;
if the running performance of the model corresponding to the second model does not reach the performance threshold, pruning the second model again;
and if the model running performance corresponding to the second model reaches the performance threshold, determining the second model as a target model for business processing.
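For illustration, the loop these five steps describe could be sketched in Python as follows; the callables, the thresholds, and the assumption that a lower running-performance value (e.g., latency in milliseconds) is better are all hypothetical stand-ins rather than the actual implementation.

```python
def compress(first_model, prune, train_and_eval, predict_performance,
             success_threshold=0.9, performance_threshold=50.0, max_rounds=10):
    """prune: model -> pruning model; train_and_eval: model -> (second model,
    training result success rate); predict_performance: model -> predicted
    model running performance (here lower is better, e.g. milliseconds)."""
    model = first_model
    for _ in range(max_rounds):
        pruning_model = prune(model)                                # prune
        second_model, success_rate = train_and_eval(pruning_model)  # retrain
        if success_rate <= success_threshold:
            # Simplification: the description instead rolls back to the
            # latest successful pruning model (see claim 6 below).
            continue
        if predict_performance(second_model) <= performance_threshold:
            return second_model          # target model for business processing
        model = second_model             # performance target missed: prune again
    return model
```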
It should be understood that the computer device 1000 described in the embodiment of the present application may perform the description of the data processing method in the embodiment corresponding to fig. 3 and fig. 7, and may also perform the description of the data processing apparatus 1 in the embodiment corresponding to fig. 11, which is not repeated herein. In addition, the description of the beneficial effects of the same method is omitted.
Furthermore, it should be noted here that the embodiments of the present application further provide a computer-readable storage medium in which the aforementioned computer program executed by the data processing apparatus 1 is stored. The computer program includes program instructions which, when executed by a processor, can perform the data processing method described in the embodiments corresponding to fig. 3 or fig. 7, and a detailed description is therefore not given here. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, please refer to the description of the method embodiments of the present application.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program, which may be stored on a computer-readable storage medium and which, when executed, may include the flows of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing disclosure is merely illustrative of preferred embodiments of the present application and is not intended to limit the scope of the claims; equivalent variations made in accordance with the claims of the present application shall still fall within the scope of the present application.

Claims (14)

1. A method of data processing, the method comprising:
pruning is carried out on the first model to obtain a pruning model; the business processing performed by the first model comprises image recognition;
training the pruning model according to the image data contained in the first sample data to obtain a second model, and obtaining the success rate of the training result of the second model; the training result success rate is determined based on a ratio between the number of successful matches and the total amount of image data contained in the first sample data; the successful matching means that the sample label of the image data is consistent with the training result of the image data; the sample tag of the image data is used for indicating the actual classification of the image data; the training result of the image data refers to the prediction classification obtained after the image data is input into the second model;
if the success rate of the training result is greater than a success rate threshold, predicting the model running performance corresponding to the second model according to the original model running performance and the model topological structure of the second model;
if the model running performance corresponding to the second model does not reach the performance threshold, pruning the second model again;
and if the model running performance corresponding to the second model reaches the performance threshold, determining the second model as a target model for image recognition.
2. The method of claim 1, wherein pruning the first model to obtain a pruned model comprises:
obtaining a result influence degree corresponding to a channel parameter among the model parameters of the first model;
determining a pruning channel among the channel parameters based on the result influence degree;
and deleting the pruning channel in the first model to obtain a pruning model.
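Purely as an illustration of this claim, the sketch below scores each output channel of a convolution kernel and deletes the weakest ones. Using the L1 norm of a channel's weights as the result influence degree is an assumption; the claim does not fix a particular influence measure.

```python
import numpy as np

def prune_channels(weights: np.ndarray, keep_ratio: float = 0.75) -> np.ndarray:
    """weights: (out_channels, in_channels, kh, kw) convolution kernel."""
    influence = np.abs(weights).sum(axis=(1, 2, 3))   # per-channel influence score
    n_keep = max(1, int(len(influence) * keep_ratio))
    keep = np.sort(np.argsort(influence)[-n_keep:])   # most influential channels
    return weights[keep]                              # pruning channels deleted

kernel = np.random.randn(64, 32, 3, 3)
print(prune_channels(kernel).shape)                   # (48, 32, 3, 3)
```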
3. The method according to claim 1, wherein training the pruning model according to the image data included in the first sample data to obtain a second model, and obtaining a training result success rate of the second model includes:
training the pruning model according to the image data contained in the first sample data to obtain a second model;
storing training results corresponding to the first sample data output by the second model into a statistical database; the statistical database is used for storing the training result success rate of the second model;
and reading the training result success rate of the second model from the statistical database.
4. The method according to claim 1, wherein if the training result success rate is greater than a success rate threshold, predicting the model operation performance corresponding to the second model according to the original model operation performance and the model topology of the second model includes:
if the success rate of the training result is greater than a success rate threshold, inputting a model topological structure of the second model and model parameters of the second model into a prediction model associated with an original model; the original model is a model which is associated with the first model and is not pruned;
predicting a performance ratio corresponding to the second model by the prediction model;
performing a performance test on the original model to obtain the original model running performance corresponding to the original model;
and determining the model running performance corresponding to the second model based on the original model running performance and the performance ratio.
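A minimal sketch of this step, assuming the predictor returns a scalar performance ratio that scales the measured original model running performance (all names here are illustrative):

```python
def estimate_running_performance(predict_ratio, topology, params,
                                 original_perf_ms: float) -> float:
    ratio = predict_ratio(topology, params)  # performance ratio from the predictor
    return ratio * original_perf_ms          # second model's estimated performance

# Usage with a stand-in predictor that always returns 60% of the original cost:
fake_predictor = lambda topology, params: 0.6
print(estimate_running_performance(fake_predictor, topology=None, params=None,
                                   original_perf_ms=120.0))  # -> 72.0
```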
5. The method of claim 4, wherein performing the performance test on the original model to obtain the original model running performance corresponding to the original model comprises:
performing performance test on the original model to obtain at least two initial running performances corresponding to the original model;
performing average value processing on the at least two initial running performances to obtain an average value corresponding to the at least two initial running performances;
and determining the average value as the running performance of the original model.
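This averaging step could look as follows; `run_original` is a hypothetical callable that executes one forward pass of the original model.

```python
import time

def original_running_performance(run_original, n_runs: int = 10) -> float:
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        run_original()                               # one forward pass
        timings.append(time.perf_counter() - start)  # one initial running performance
    return sum(timings) / len(timings)               # mean of the initial performances
```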
6. The method as recited in claim 1, further comprising:
if the success rate of the training result is smaller than or equal to the success rate threshold, acquiring a successful pruning model with the latest storage time stamp from a locally stored successful pruning model set; the successful pruning model refers to the second model with the success rate of the training result being greater than the success rate threshold;
updating the pruning model based on the successful pruning model with the latest storage timestamp.
7. The method as recited in claim 6, further comprising:
and storing the second model with the success rate of the training result larger than the success rate threshold value in the successful pruning model set, and determining the second model as a successful pruning model with the latest storage time stamp in the successful pruning model set.
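One possible shape for the locally stored successful pruning model set of claims 6 and 7 is sketched below; the class, the function names, and the rollback policy are assumptions.

```python
import time

class SuccessfulPruningModels:
    """Locally stored set of successful pruning models (claims 6 and 7)."""

    def __init__(self):
        self._models = []                          # (storage timestamp, model)

    def store(self, model):
        self._models.append((time.time(), model))  # newest storage timestamp

    def latest(self):
        if not self._models:
            return None
        return max(self._models, key=lambda entry: entry[0])[1]

def next_pruning_model(second_model, success_rate, threshold,
                       store: SuccessfulPruningModels):
    if success_rate > threshold:
        store.store(second_model)                  # claim 7: record the success
        return second_model
    return store.latest() or second_model          # claim 6: roll back if possible
```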
8. The method as recited in claim 4, further comprising:
obtaining model parameters of a sample model; the model parameters of the sample model and the model parameters of the original model have the same model topological structure;
randomly deleting channel parameters in model parameters of the sample model to obtain a sample pruning model corresponding to the sample model;
training the sample pruning model according to second sample data to obtain a sample training model;
acquiring an actual performance ratio between the sample training model and the sample model;
predicting a prediction performance ratio corresponding to the sample training model through an initial prediction model;
adjusting the initial prediction model according to the actual performance ratio and the predicted performance ratio;
and when the adjusted initial prediction model meets the convergence condition, determining the adjusted initial prediction model as a prediction model associated with the original model.
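As a toy illustration of this training loop, the sketch below uses random pruning masks in place of sample pruning models, a linear model as the initial prediction model, and a synthetic actual performance ratio. Every element is an assumption chosen only to keep the example self-contained and runnable.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_steps, lr = 32, 200, 0.02

def actual_ratio(mask):
    # Stand-in for "train the sample pruning model and measure the actual
    # performance ratio against the sample model" -- a synthetic target here.
    return 0.2 + 0.8 * mask.mean()

weights = np.zeros(n_channels)   # the "initial prediction model" (linear)
bias = 0.0
for _ in range(n_steps):
    mask = (rng.random(n_channels) > 0.4).astype(float)  # random channel deletion
    predicted = weights @ mask + bias         # prediction performance ratio
    error = predicted - actual_ratio(mask)    # compare with actual performance ratio
    weights -= lr * error * mask              # adjust the initial prediction model
    bias -= lr * error

test_mask = (rng.random(n_channels) > 0.4).astype(float)
print(weights @ test_mask + bias, actual_ratio(test_mask))  # typically close
```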
9. The method as recited in claim 8, further comprising:
determining a loss value of the adjusted initial prediction model according to the actual performance ratio, the prediction performance ratio output by the adjusted initial prediction model and the regularized loss value of the adjusted initial prediction model;
when the loss value is smaller than a loss function threshold value, obtaining a prediction result success rate of the adjusted initial prediction model;
if the success rate of the prediction result of the adjusted initial prediction model is greater than a prediction success rate threshold value, determining that the adjusted initial prediction model meets a convergence condition;
and if the success rate of the prediction result of the adjusted initial prediction model is smaller than or equal to the threshold value of the prediction success rate, regenerating a sample pruning model to reconstruct the initial prediction model.
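The loss and convergence gate of this claim can be sketched as follows; the mean-squared-error data term and the L2 regularization are assumptions, since the claim only requires that the loss combine the actual performance ratio, the prediction performance ratio, and a regularized loss value.

```python
import numpy as np

def loss_value(actual_ratios, predicted_ratios, weights, reg_lambda=1e-3):
    # Data term: mean squared gap between actual and predicted performance
    # ratios; regularization term: assumed L2 penalty on the model weights.
    mse = np.mean((np.asarray(actual_ratios) - np.asarray(predicted_ratios)) ** 2)
    regularized_loss = reg_lambda * np.sum(np.asarray(weights) ** 2)
    return mse + regularized_loss

def meets_convergence(loss, prediction_success_rate,
                      loss_threshold=0.01, success_threshold=0.9):
    # A small enough loss triggers the success-rate check, and only a high
    # enough prediction result success rate counts as convergence.
    return loss < loss_threshold and prediction_success_rate > success_threshold
```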
10. The method of claim 9, wherein obtaining the prediction result success rate of the adjusted initial prediction model when the loss value is smaller than a loss function threshold comprises:
when the loss value is smaller than the loss function threshold, testing the adjusted initial prediction model based on a sample test model to obtain a sample test result corresponding to the sample test model; the sample test model is a model obtained by pruning the sample model;
determining a prediction error corresponding to the adjusted initial prediction model based on the sample test result;
determining the sample test result with the prediction error smaller than an error threshold value as a successful sample test result;
and counting the success rate of the prediction results of the adjusted initial prediction model according to the total number of the sample test results and the number of the successful sample test results.
11. The method of claim 10, wherein determining a prediction error corresponding to the adjusted initial prediction model based on the sample test results comprises:
obtaining a test performance ratio in the sample test result;
determining the test model running performance corresponding to the sample test model according to the test performance ratio in the sample test result and the original model running performance;
acquiring the actual model running performance corresponding to the sample test model;
and determining a prediction error corresponding to the adjusted initial prediction model according to the test model running performance, the actual model running performance of the sample test model and the original model running performance.
12. A data processing apparatus, the apparatus comprising:
The pruning module is used for pruning the first model to obtain a pruning model; the business processing performed by the first model comprises image recognition;
the first training module is used for training the pruning model according to the image data contained in the first sample data to obtain a second model, and obtaining the success rate of training results of the second model; the training result success rate is determined based on a ratio between the number of successful matches and the total amount of image data contained in the first sample data; the successful matching means that the sample label of the image data is consistent with the training result of the image data; the sample tag of the image data is used for indicating the actual classification of the image data; the training result of the image data refers to the prediction classification obtained after the image data is input into the second model;
the first prediction module is used for predicting the model running performance corresponding to the second model according to the original model running performance and the model topological structure of the second model if the success rate of the training result is greater than a success rate threshold;
the pruning module is used for pruning the second model again if the running performance of the model corresponding to the second model does not reach the performance threshold;
and the first determining module is used for determining the second model as a target model for image recognition if the model running performance corresponding to the second model reaches the performance threshold.
13. A computer device, comprising: a processor, a memory, a network interface;
the processor is connected to the memory and the network interface, wherein the network interface is used to provide data communication functions, the memory is used to store a computer program, and the processor is used to invoke the computer program to perform the method according to any of claims 1-11.
14. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method of any of claims 1-11.
CN202010079087.1A 2020-02-03 2020-02-03 Data processing method, device, computer equipment and storage medium Active CN111310918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010079087.1A CN111310918B (en) 2020-02-03 2020-02-03 Data processing method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010079087.1A CN111310918B (en) 2020-02-03 2020-02-03 Data processing method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111310918A CN111310918A (en) 2020-06-19
CN111310918B true CN111310918B (en) 2023-07-14

Family

ID=71145490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010079087.1A Active CN111310918B (en) 2020-02-03 2020-02-03 Data processing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111310918B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553169B (en) * 2020-06-25 2023-08-25 北京百度网讯科技有限公司 Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN111539224B (en) * 2020-06-25 2023-08-25 北京百度网讯科技有限公司 Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN114611705A (en) * 2020-11-23 2022-06-10 华为技术有限公司 Data processing method, training method for machine learning, and related device and equipment


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11593632B2 (en) * 2016-12-15 2023-02-28 WaveOne Inc. Deep learning based on image encoding and decoding
WO2019082165A1 (en) * 2017-10-26 2019-05-02 Uber Technologies, Inc. Generating compressed representation neural networks having high degree of accuracy

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107368891A (en) * 2017-05-27 2017-11-21 深圳市深网视界科技有限公司 A kind of compression method and device of deep learning model
WO2019186194A2 (en) * 2018-03-29 2019-10-03 Benevolentai Technology Limited Ensemble model creation and selection
CN109460613A (en) * 2018-11-12 2019-03-12 北京迈格威科技有限公司 Model method of cutting out and device
CN110059823A (en) * 2019-04-28 2019-07-26 中国科学技术大学 Deep neural network model compression method and device
CN110414673A (en) * 2019-07-31 2019-11-05 北京达佳互联信息技术有限公司 Multimedia recognition methods, device, equipment and storage medium
CN110674939A (en) * 2019-08-31 2020-01-10 电子科技大学 Deep neural network model compression method based on pruning threshold automatic search

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Pruning from Scratch; Yulong Wang et al.; arXiv; 1-12 *
Pruning Algorithm Based on the GoogLeNet Model; Peng Dongliang et al.; Control and Decision; Vol. 34, No. 6; 1259-1264 *
Adaptive PPM Prediction Model Based on Pruning Technology; Cao Yangjie et al.; Computer Engineering and Applications; 141-144, 148 *

Also Published As

Publication number Publication date
CN111310918A (en) 2020-06-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40024278

Country of ref document: HK

GR01 Patent grant