CN110874550A - Data processing method, device, equipment and system - Google Patents
- Publication number: CN110874550A (application CN201811016411.4A)
- Authority: CN (China)
- Prior art keywords: neural network, network model, data, neuron, activation
- Legal status: Pending (an assumption by Google Patents, not a legal conclusion; no legal analysis has been performed)
Classifications
- G06V20/52 — Surveillance or monitoring of activities, e.g. for recognising suspicious objects (Physics; Computing; Image or video recognition or understanding; Scenes; Context or environment of the image)
- G06N3/04 — Architecture, e.g. interconnection topology (Physics; Computing; Computing arrangements based on biological models; Neural networks)
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06N3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Abstract
The application discloses a data processing method comprising the following steps: an edge device obtains an initial neural network model that includes at least one neuron, and inputs the data to be processed into a trained neural network model to obtain result data. The result data are obtained by processing the data to be processed with the neurons of the trained neural network model; the trained neural network model is obtained by training a pruned neural network model with N groups of training samples; and the pruned neural network model is obtained by pruning at least one neuron of the initial neural network model according to the activation information of each such neuron. In this way, the accuracy of data processing is improved, the amount of computation is reduced, and waste of computing resources is avoided.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method, apparatus, device, and system.
Background
With the development of deep learning technology, and in particular the popularization of convolutional neural networks, neural networks are widely applied in fields such as image processing and face recognition. At present, in view of model generality, design cost, and similar considerations, a single general-purpose neural network model is usually designed for data processing within a given application scenario.
In a video monitoring scene, however, such a model does not account for the actual monitoring environment, such as the different installation heights and angles of the cameras. If the same neural network model is used to process the images collected by different cameras, the accuracy of image processing suffers. In addition, to remain general, the neural network model is designed with many neurons and many network layers, so in some specific scenes computing resources are also wasted.
Disclosure of Invention
The application discloses a data processing method, apparatus, device, and system that can provide a neural network model suited to an edge device, or to the scene in which the edge device is currently located, and use it for the corresponding data processing, thereby improving the accuracy of data processing and avoiding waste of computing resources.
In a first aspect, the present application discloses a data processing method, including: an edge device obtains a trained neural network model and inputs the data to be processed into it to obtain result data. The trained neural network model is obtained by training a pruned neural network model with N groups of training samples; the pruned neural network model is obtained by pruning the initial neural network model according to the activation information of each neuron in the initial neural network model.
Because model training is computationally expensive and the computing resources of the edge device are limited, training of the neural network model is generally performed on the data center side. In other words, the data center prunes the neurons of the initial neural network model according to the activation information of each neuron to obtain a pruned neural network model, and then trains the pruned model with the N groups of training samples to obtain the trained neural network model. The data center then sends the trained neural network model to the edge device so that the edge device can process data with it.
The activation information of a neuron is the information generated by that neuron when the initial neural network model is used for data processing. It includes, but is not limited to, an activation value, a number of activations, and an average activation value, as described in more detail below.
By implementing this process, a neural network model suited to the edge device (or to the scene in which the edge device is currently located) can be provided and then used for data processing, improving its accuracy. In addition, compared with the general-purpose neural network model of the prior art, the model scale can be reduced, the data processing rate improved, and waste of computing resources avoided.
In one possible embodiment, each group of training samples includes input data and output data, the output data being obtained by processing the input data with the initial neural network model. Specifically, the edge device obtains the initial neural network model from the data center, collects multiple groups of input data from the device itself or from its current scene, and processes them with the initial neural network model to obtain the corresponding output data. Each group of input data together with its output data is then taken as one group of training samples, yielding multiple groups. These are sent to the data center so that the data center can retrain the initial neural network model with them and obtain a trained neural network model suited to the edge device (or its current scene).
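The edge-side sample-collection loop described above can be sketched as follows. This is a minimal sketch, not the patented implementation: `collect_input` and `initial_model` are hypothetical stand-ins for the device's data source and the model downloaded from the data center.

```python
def build_training_samples(collect_input, initial_model, n_groups):
    """Run the initial model over locally collected inputs; each
    (input, model output) pair becomes one training sample, so no
    manual labelling is needed."""
    samples = []
    for _ in range(n_groups):
        x = collect_input()   # data from the edge device's own scene
        y = initial_model(x)  # output of the initial neural network model
        samples.append((x, y))
    return samples
```

The resulting list of pairs is what the edge device would send back to the data center for retraining.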
Optionally, each time data processing is performed with the initial neural network model, the activation information of each neuron in the model may be recorded.
By implementing these steps, the edge device can obtain a customized trained neural network model that fits training samples from the edge device or from its scene, and can therefore produce more accurate results. Moreover, the training samples need no manual labelling, which reduces user operations while improving the training precision of the neural network model.
In one possible embodiment, the activation information comprises at least one of: an activation value, a number of activations, and an average activation value. The activation value is the output value of a neuron each time the initial neural network model is used for data processing. The number of activations is the number of times, across multiple rounds of data processing, that the activation value of the neuron is greater than a preset threshold (a fourth threshold). The average activation value is the average of the neuron's activation values across multiple rounds of data processing. Correspondingly, the data center's pruning of the initial neural network model includes at least one of the following: when the activation information includes the activation value, the data center deletes (prunes) neurons whose activation value is less than or equal to a first threshold, according to the activation value of each of the at least one neuron; when it includes the number of activations, the data center deletes neurons whose number of activations is less than or equal to a second threshold; and when it includes the average activation value, the data center deletes neurons whose average activation value is less than or equal to a third threshold.
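One illustrative reading of the three pruning rules above can be sketched as follows. The neuron identifiers, the layout of `stats`, and the default threshold values are assumptions for illustration, not the patent's actual data structures; `t_act`, `t_count`, and `t_avg` correspond to the first, second, and third thresholds.

```python
def prune_model(neurons, stats, t_act=0.0, t_count=0, t_avg=0.0):
    """Keep only neurons whose recorded activation statistics exceed
    every configured threshold; the rest are deleted (pruned).
    `stats` maps neuron id -> dict with keys 'last_activation',
    'activation_count', 'avg_activation'."""
    kept = []
    for n in neurons:
        s = stats[n]
        if s["last_activation"] <= t_act:
            continue  # activation value at or below first threshold
        if s["activation_count"] <= t_count:
            continue  # activation count at or below second threshold
        if s["avg_activation"] <= t_avg:
            continue  # average activation at or below third threshold
        kept.append(n)
    return kept
```

Applied to a recorded statistics table, the returned list defines the pruned neural network model's surviving neurons.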
By implementing this process, the data center can prune neurons from the initial neural network model according to their activation information to obtain the pruned neural network model, which facilitates its subsequent training into a trained neural network model suited to the edge device or its scene. The model scale is thereby reduced, waste of computing resources lessened, and data processing efficiency improved.
In a possible implementation, the trained neural network model is obtained by the data center updating the parameters of the pruned neural network model with a loss function. The loss function indicates the error loss between training data and prediction data: the training data are the outputs produced in the initial neural network model before the input data reach the fully connected layer, and the prediction data are the corresponding outputs produced in the pruned neural network model before its fully connected layer. Specifically, the data center trains the pruned neural network model multiple times with the N groups of training samples, corrects the model parameters with the loss function during each round, and selects the round with the smallest loss value; the parameters of that round become the parameters of the trained neural network model.
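The select-the-minimum-loss training loop just described can be sketched as follows. `train_step` and `loss_of` are hypothetical callables standing in for one round of parameter updates and for the loss-function evaluation; the sketch only shows the selection logic, not a real optimizer.

```python
def train_pruned_model(train_step, loss_of, rounds):
    """Train the pruned model several times and keep the parameters
    from the round with the smallest loss value.
    `train_step(i)` returns the candidate parameters of round i;
    `loss_of(params)` evaluates the loss function on them."""
    best_params, best_loss = None, float("inf")
    for i in range(rounds):
        params = train_step(i)
        loss = loss_of(params)
        if loss < best_loss:  # keep the minimum-loss round
            best_params, best_loss = params, loss
    return best_params, best_loss
```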
By implementing this process, the data center can train a more accurate neural network model via the loss function, so that the edge device can subsequently use the trained model directly for data processing, improving its accuracy.
In a second aspect, the present application provides a data processing apparatus comprising functional modules or units for performing the method as described in the first aspect above or in any possible implementation of the first aspect.
In a third aspect, the present application provides an edge device (e.g., a smart camera or a roadside monitoring device) comprising a processor, a memory, a communication interface, and a bus, where the processor, the communication interface, and the memory communicate with one another through the bus; the communication interface is configured to receive and transmit data; the memory is configured to store instructions; and the processor is configured to invoke the instructions in the memory to perform the method described in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, the present application provides a data processing system comprising a data center and an edge device. The data center is configured to store an initial neural network model and to train it with N groups of training samples to obtain a trained neural network model. The edge device comprises a processor, a memory, a communication interface, and a bus, where the processor, the communication interface, and the memory communicate with one another through the bus; the communication interface is configured to receive and transmit data; the memory is configured to store instructions; and the processor is configured to invoke the instructions in the memory to perform the method described in the first aspect or any possible implementation of the first aspect.
In a fifth aspect, the present application provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to perform the method of the first aspect described above.
In a sixth aspect, the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the above aspects.
On the basis of the implementations provided by the above aspects, the present application can further combine them to provide additional implementations.
Drawings
Fig. 1 is a schematic structural diagram of a YOLO model according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a network framework of a data processing system according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating a data processing method according to an embodiment of the present invention.
Fig. 4A-4B are schematic structural diagrams of two network layers according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Fig. 6 is a schematic structural diagram of an edge device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail below with reference to the accompanying drawings of the present invention.
First, some technical terms related to the present invention are introduced.
The edge device refers to a device installed on the edge network side. For example, in a video monitoring scene, the edge device may be specifically a monitoring device or a smart camera installed on a road.
A neural network model is a complex network system formed by a large number of interconnected simple processing units called neurons. Described on the basis of a mathematical model of the neuron, it has capabilities such as large-scale parallelism, distributed storage and processing, self-adaptation, and self-learning; see below for a description of neurons. In practical applications, a neural network model is composed of at least one of the following network layers: convolutional layers, activation layers, pooling layers, and fully connected layers; the present invention does not limit the number of each network layer, which may be one or more. Each network layer consists of one or more neurons, whose parameters (weights) may also be called the model parameters of the neural network model. Neurons within a layer, or across different layers, may be interconnected, as shown in Fig. 4A below.
Neurons are also called nodes. Each neuron represents a particular output function, also called its excitation function. Using a neural network for data processing is, in essence, using the neurons of the neural network model, that is, their excitation functions, to process the data. Understandably, because neural network models differ, the excitation function of a corresponding neuron may also differ between models; the present invention is not limited in this respect.
A convolutional layer is a network layer for feature extraction: a convolution operation is performed on the input data to extract its deep-level feature data. Taking an image as the input data for example, the convolutional layer performs convolution operations on the image to obtain the deep feature maps hidden within it.
A pooling layer is a network layer for data compression. An activation layer performs an activation operation on the input data, in essence applying a specific function to enhance the non-linear expressive capability of the data. A fully connected layer plays the role of a classifier in the neural network model: each of its neurons (nodes) is connected to the neurons of the previous network layer so as to integrate the feature data they output and obtain the final output of the neural network model.
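The operations of these layers can be illustrated with a toy one-dimensional sketch in plain Python; the helper names and the 1-D shapes are hypothetical simplifications, not the layers as actually deployed in the patent's models.

```python
def relu(v):
    """Activation layer: element-wise non-linearity (ReLU here)."""
    return [max(0.0, x) for x in v]

def max_pool(v, size=2):
    """Pooling layer: compress the data by keeping the maximum of
    each window of `size` consecutive values."""
    return [max(v[i:i + size]) for i in range(0, len(v), size)]

def fully_connected(v, weights, bias):
    """Fully connected layer: every output integrates every input
    via a weighted sum plus a bias."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]
```

Chaining these on a feature vector mirrors the layer order described above: convolution/activation, pooling for compression, then a fully connected classifier at the end.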
In the present invention, the neural network model includes, but is not limited to, convolutional neural network models, recurrent neural network models, deep neural network models, feedforward neural network models, deep belief network models, generative adversarial network models, and other neural network models. For convenience of description, the following embodiments take the YOLO model, a convolutional neural network model, as an example. The YOLO model is a network model for target detection, e.g., detecting the target objects contained in an image, such as a vehicle or a puppy. The relevant aspects of the YOLO model are briefly described below.
Fig. 1 is a schematic diagram of the logical structure of a neural network model according to an embodiment of the present invention; this model may also be referred to as a YOLO model. As shown, the model includes 9 network layers, of which the first 7 (layer 1 through layer 7) are convolutional layers and the last 2 (layer 8 and layer 9) are fully connected layers. The number of neurons in each network layer may differ.
Activation information is information related to the activation of a neuron, such as its activation value, number of activations, and average activation value. A neuron is considered activated when it is used to perform a data operation (data calculation) in the neural network model.
The activation value refers to the output value of a neuron when it is activated; in other words, the output value obtained when the neuron performs a data operation on input data.
The number of activations refers to the number of times, when a neuron is activated multiple times, that its output value is greater than a preset threshold; in other words, when the neuron is used for data calculation multiple times, it counts the occasions on which the activation value (output value) exceeded the preset threshold. The preset threshold is user- or system-defined: it may be obtained by the user or the system from a series of experimental statistics, or set as an empirical value based on practical experience, and the present invention does not limit this.
The average activation value refers to the average of a neuron's output values over multiple activations; in other words, the average of the activation values obtained when the neuron is used for data calculation multiple times.
Illustratively, in an image recognition scene the edge device is a smart camera. The edge device periodically acquires multiple images of the current scene and inputs each into the neural network model for processing to obtain the corresponding result data, which indicate the classification to which the image belongs, e.g., forest, beach, or starry sky. Each time an image is processed with the neural network model, it is in essence processed by the model's neurons; across many images, the edge device therefore uses those neurons many times. During each round of image processing, the edge device records the activation information of the neurons, such as activation values, activation counts, and average activation values, to facilitate subsequent updating of the neural network model with this information, as detailed below.
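The per-inference bookkeeping described above can be sketched as follows. The class name and the layout of `stats` are assumptions for illustration; `threshold` corresponds to the preset threshold used when counting activations.

```python
class ActivationRecorder:
    """Accumulate per-neuron activation statistics across many
    inferences: the last activation value, how often the output
    exceeded the preset threshold, and the average activation value."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.stats = {}

    def record(self, neuron_id, output_value):
        s = self.stats.setdefault(
            neuron_id, {"last": 0.0, "count": 0, "sum": 0.0, "n": 0})
        s["last"] = output_value          # most recent activation value
        s["sum"] += output_value          # running total for the average
        s["n"] += 1
        if output_value > self.threshold:
            s["count"] += 1               # one more above-threshold activation

    def average(self, neuron_id):
        s = self.stats[neuron_id]
        return s["sum"] / s["n"]
```

A table accumulated this way is exactly the kind of activation information the data center would later consult when pruning.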
A loss function, also referred to as a cost function, is the optimization function of a neural network model and measures the degree of prediction error. The essence of training a neural network model is finding the minimum of the loss function, i.e., the point at which the difference between the predicted result and the actual result (also called the loss value) is smallest, or the two are closest. Understandably, in the actual training process the data center trains the model parameters multiple times, takes the round with the smallest loss value, and uses its parameters as the model parameters of the trained neural network model, thereby obtaining the trained model.
For example, taking an image recognition model as the neural network model, assume the training samples are a plurality of images, each composed of one or more pixels with respective pixel values (i.e., the real input data). Each image contains a target object, and the pixel values of the image region where the target object lies differ from those of the other regions; in other words, the target object can be distinguished or identified by the pixel values of its image region.
Correspondingly, the data center inputs the images into the image recognition model and performs inference computation on the pixels of each image to obtain their prediction data. It then uses a preset loss function to compare the prediction data of the pixels in the images with their real data so as to correct the model parameters of the image recognition model. Repeating these operations, the data center can find the round in which the difference between the prediction data and the real data of the pixels is smallest, and use the parameters of the model as corrected in that round as the model parameters of the trained image recognition model, thereby obtaining the trained model.
Fig. 2 is a schematic diagram of the network framework of a data processing system according to an embodiment of the present invention. As shown, the data processing system 100 includes a data center 102 and an edge device 104. The data center 102, also referred to as the cloud, includes a plurality of servers; a neural network model (specifically, an initial neural network model or a trained neural network model) is deployed in the data center 102 for the edge device 104 to download and use.
In practical applications, because data processing places heavy energy-consumption demands on the edge device while its computing power is low, training of the neural network model is usually completed in the data center 102.
The training process of the neural network model is briefly described below. Specifically, the data center obtains an initial neural network model (the neural network model to be trained or optimized) and a training sample set, and then trains the initial model multiple times with that set. Across the training rounds, it selects the one with the smallest loss value and takes the model parameters generated in that round as the parameters of the trained neural network model, thereby obtaining the trained model. Training is described in detail below.
Wherein, the initial neural network model can be determined according to the actual application scene. Illustratively, taking the image recognition scenario as an example, the initial neural network model may be a convolutional neural network model, such as the YOLO model above. Taking a speech recognition scenario as an example, the initial neural network model may be a deep neural network model, which may include one or more network layers, and neurons in two adjacent network layers may be connected to each other, and the invention is not limited thereto.
The training sample set comprises N groups of training samples, each used to train the initial neural network model to obtain the trained model. Each group comprises input data and output data in one-to-one correspondence, and N is a positive integer. In different application scenarios the training samples (i.e., the input data and output data) may differ.
For example, in an image recognition scenario, assume an edge device wants to use a neural network model to recognize the sun at different times of day in images. Correspondingly, when the training sample set is selected, the input data comprise images of the sun taken at different periods of a day, and the output data comprise the position of the sun in each image, or other feature information identifying the period to which the sun in the image belongs, so that the trained neural network model can accurately classify the period of the sun in an image.
Accordingly, after the data center obtains the training sample set (i.e., multiple groups of training samples comprising input and output data), it may train the initial neural network model with the sun images in the set and the real period of the sun in each image. Specifically, the data center inputs each sun image into the initial neural network model and processes its pixels so as to predict the period of the sun from the sun's position in the image or from other feature information identifying that period. The data center then uses a preset loss function to compute the difference between the real period and the predicted period of the sun in each image. Repeating this process many times, the data center determines the round with the smallest difference and uses that round's model parameters as the parameters of the trained neural network model, thereby obtaining the trained model.
In the invention, the data center trains the neural network model using training samples comprising input data and output data; specifically, the initial neural network model is trained in an unsupervised, automated mode without human participation. Compared with the prior art, in which the neural network model is trained using manually labeled input data, errors caused by human participation can be avoided, so that the accuracy of neural network model training is improved; moreover, since no human participation is needed, the convenience of training the neural network model is also improved.
For example, in an image recognition application scenario, the neural network model training process in the conventional technology usually selects labeled images as training samples, for example images labeled with the objects they contain, such as a puppy, a person, or a vehicle. The labeled images are then used directly to train the initial neural network model.
In the present invention, however, the data center uses unannotated images as training samples for training the initial neural network model. Specifically, a training sample includes input data (here, an image) and output data (here, the objects included in the image). The output data is obtained by feeding the image into the initial neural network model as input: the model computes over the pixels in the image and predicts the objects it contains according to their characteristic information (such as contour features of the object, an identifier of the object, and the like). Training the neural network model with samples that require no manual labeling therefore improves the convenience of training; at the same time, errors caused by manual labeling are avoided, which can improve the accuracy of the neural network model.
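The label-free sample construction described above — pairing each raw input with the initial model's own prediction — can be sketched as follows. The "model" here is a stand-in stub, and the function name is an assumption for illustration.

```python
def build_training_set(initial_model, inputs):
    """Label-free sample construction: each raw input is paired with the
    initial model's own output, yielding (input data, output data) training
    pairs without any manual annotation."""
    return [(x, initial_model(x)) for x in inputs]

# Stand-in "model": classify an image (list of pixel values) by brightness.
toy_model = lambda image: "bright" if sum(image) > 10 else "dark"
samples = build_training_set(toy_model, [[9, 9], [1, 2]])
# samples == [([9, 9], 'bright'), ([1, 2], 'dark')]
```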
The edge device 104 may obtain an initial neural network model or a trained neural network model from the data center 102, so as to perform corresponding data processing by using the obtained neural network model. In different application scenarios, the actual data processing performed by using the acquired neural network model is also different.
In an image recognition scenario, the input data is an image containing characteristic information of a vehicle to be recognized, such as the vehicle's identification or contour. Accordingly, the edge device may input the image into the trained neural network model and obtain the vehicle included in the image according to the characteristic information of the vehicle in the image. As another example, in a speech recognition scenario, the input data is the speech to be recognized. Accordingly, the edge device may input the speech to be recognized into the trained neural network model, which recognizes and processes characteristic information such as frequency or wavelength, so as to obtain the text information corresponding to the speech to be recognized, which is not limited in the present invention.
Alternatively, the above training sample set (N groups of training samples) may be obtained as follows: the edge device 104 collects data and sends the collected data to the data center 102, and the data center 102 then trains the initial neural network model using the training sample set. Specifically, the edge device 104 may first obtain the initial neural network model from the data center 102. Then, the edge device can acquire N groups of input data through its own device or through other edge devices, and input each group of input data into the initial neural network model to obtain the corresponding output data. On this principle, the edge device can obtain a training sample set comprising N groups of training samples, where each group includes input data and output data, and details are not repeated here.
It is understood that in different application scenarios the type of the input data or output data may differ, and may include, but is not limited to, image data, voice data, text data, and the like. Specifically, when the edge device has data collection capability, it can collect the corresponding input data itself. Taking an image processing scene as an example, if the edge device is provided with a camera, the edge device can capture one or more images of the current scene through the camera as input data. Conversely, when the edge device does not have data acquisition capability, it can obtain the corresponding input data through other edge devices that do. Further, the edge device obtains the corresponding training samples from the acquired input data; reference may be made to the related explanation in the foregoing embodiments, and details are not described here.
In the present invention, the number of edge devices 104 is not limited; there may be one or more, and n devices are illustrated as an example, where n is a positive integer. The edge device may be, for example, an intelligent camera, a smart phone, a tablet computer, a palmtop computer, a notebook computer, or the like, and the present invention is not limited thereto.
The following illustrates a specific implementation process by which the edge device obtains the training sample set. Take an image classification application scene as an example, where the edge device is a monitoring device. Accordingly, the edge device may capture multiple images of the current scene as input data for the neural network model. Further, the edge device may obtain the initial neural network model from the data center and process the captured images with it to obtain the classification to which each input image belongs, for example a person image, a starry sky image, a sea view image, or a beach image. The edge device may then treat each input image and its classification as a group of training samples, thereby obtaining a training sample set comprising multiple groups of training samples. Optionally, the edge device may send the training sample set to the data center, so that the data center can retrain the initial neural network model with it, which is not described herein again.
Next, a related embodiment of a data processing method to which the present invention relates will be described. Fig. 3 is a schematic flow chart of a data processing method according to an embodiment of the present invention. The method as shown in fig. 3 comprises the following implementation steps:
step S301, the edge device obtains an initial neural network model from the data center.
In the invention, the initial neural network model may also be referred to as a full-scale model, which may be a general model trained in advance by the data center with a training sample set and suitable for all or some application scenarios, for example the YOLO model commonly used in the field of object detection.
Step S302, the edge device acquires N groups of input data, and inputs the N groups of input data into the initial neural network model respectively to obtain output data of the N groups of input data and respective activation information of each neuron in the initial neural network model.
The edge device can collect corresponding input data when performing data processing (inference) with the initial neural network model. In different application scenarios the input data may differ; for example, it may be image data, text data, or voice data, as described in the foregoing embodiments. Further, the edge device may input the input data into the initial neural network model for calculation and obtain the corresponding output data. On this principle, the edge device performs N rounds of inference with the initial neural network model. That is, the edge device may collect N groups of input data and process them with the initial neural network model N times to obtain the output data corresponding to each group.
Optionally, the edge device takes the input data and output data obtained in one inference as a group of training samples; the edge device uses the initial neural network model once for each output it obtains. After computing with the initial neural network model on N groups of input data, the edge device therefore obtains N groups of training samples, each comprising input data and the output data produced by processing that input data with the initial neural network model.
Optionally, the edge device essentially computes a set of input data using neurons in the initial neural network model each time the input data is processed using the initial neural network model. In the calculation process, the edge device can also record activation information generated when each neuron in the initial neural network model performs calculation (is activated). In other words, the edge device may record the respective activation information of each neuron each time a data calculation is performed using each neuron in the initial neural network model. The activation information includes, but is not limited to, any one or combination of more of the following: an activation value, a number of activations, and an average activation value. For the description of the activation information, reference may be made to the foregoing embodiments, which are not described in detail herein.
Specifically, each time the edge device computes input data using the neurons in the initial neural network model, it may record the activation value of each neuron. When the edge device performs data processing with the neurons of the initial neural network model multiple times (e.g., M times), it may also count and record each neuron's number of activations and average activation value. The number of activations of a neuron is the number of its M activation values, obtained over the M rounds of data processing, that are greater than or equal to a preset threshold. The average activation value of a neuron is the average of its M activation values over the M rounds of data processing. M is a positive integer less than or equal to N.
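The bookkeeping described above — the activation value of every run, the number of activations at or above a threshold, and the average over M runs — can be sketched as follows. The class and method names are illustrative assumptions, not the patent's implementation.

```python
class ActivationStats:
    """Per-neuron activation bookkeeping: record each run's activation
    value, then derive the number of activations (values >= threshold)
    and the average activation value over the M recorded runs."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.values = {}  # neuron id -> list of recorded activation values

    def record(self, neuron_id, activation_value):
        self.values.setdefault(neuron_id, []).append(activation_value)

    def activation_count(self, neuron_id):
        # Number of runs in which the neuron's activation met the threshold.
        return sum(v >= self.threshold for v in self.values[neuron_id])

    def average_activation(self, neuron_id):
        vals = self.values[neuron_id]
        return sum(vals) / len(vals)

stats = ActivationStats(threshold=0.5)
for v in (0.0, 1.0, 0.8):  # M = 3 inference runs for neuron "n1"
    stats.record("n1", v)
# activation count == 2, average activation ~= 0.6
```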
Step S303, the edge device sends the N sets of training samples and the activation information of each neuron in the initial neural network model to the data center.
Accordingly, the data center receives N sets of training samples and the respective activation information for each neuron. The training sample includes input data and output data, where the output data is data obtained by calculating the input data as an input of the initial neural network model, and for the input data and the output data, reference may be made to the relevant explanations in the foregoing embodiments, and details are not described here.
The edge device sends the N groups of training samples and the activation information of each neuron to the data center, so that the data center can retrain the initial neural network model with this information to obtain a neural network model adapted to the edge device (or to its deployment scenario). In this way, dedicated neural network models can be trained for different edge devices, meeting the actual requirements of each device and improving the practicability of model processing. In other words, the invention can retrain a personalized neural network model for an edge device or its deployment scenario, so as to meet that device's actual requirements.
And S304, the data center performs pruning processing on the neurons in the initial neural network model according to the respective activation information of each neuron to obtain a pruning neural network model.
In the invention, the data center deletes the neurons meeting any one or more of the following conditions according to the respective activation information of each neuron in the initial neural network model so as to obtain a corresponding pruning neural network model. The conditions specifically include:
1) the activation value of the neuron is less than or equal to a first threshold;
2) the number of activations of neurons is less than or equal to a second threshold;
3) the average activation value of the neurons is less than or equal to a third threshold value.
Therefore, neurons which are not suitable for the edge device or the deployment scene of the edge device can be cut off conveniently, so that the model scale is reduced, the calculation amount is reduced, and the calculation resources are saved.
The first threshold, the second threshold, and the third threshold may be set by a user or by the system; they may be the same or different, and the present invention is not limited thereto. For example, if the system wants a pruned neural network model with higher computational accuracy, the three thresholds can be set larger, for example 5. Conversely, if the system wants a pruned neural network model with lower computational accuracy, the three thresholds can be set smaller, for example 0.01. Optionally, the three thresholds may be derived by the system from statistical data, or may be empirical values set by a user according to practical experience, which is not limited in the present invention.
For example, fig. 4A shows a schematic structural diagram of two network layers in the YOLO model. As shown in fig. 4A, the Nth network layer (referred to as the Nth layer; specifically, any one of layers 1-8 in the YOLO model in fig. 1) includes 6 neurons, denoted O_n1, O_n2, …, O_n6. The (N+1)th layer includes 4 neurons, denoted O_(n+1)1, O_(n+1)2, …, O_(n+1)4. The two adjacent layers are fully connected, i.e., each neuron of the Nth layer is connected to all neurons of the (N+1)th layer. In the pruning process, assume the activation values of the 6 neurons in the Nth layer are 0, 1, 0, 0, 1, and 1, and the activation values of the 4 neurons in the (N+1)th layer are 1, 0, 1, and 1. Accordingly, the data center prunes the neurons with activation values less than or equal to 0, obtaining the pruned neural network model shown in fig. 4B. That is, the data center deletes neurons O_n1, O_n3, and O_n4 in the Nth layer and neuron O_(n+1)2 in the (N+1)th layer, thereby obtaining a pruned neural network model comprising the two network layers shown in fig. 4B.
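Under the assumption that the two layers are plain fully connected weight matrices, the pruning of fig. 4A into fig. 4B can be sketched with boolean masking. The matrix shapes and function name here are illustrative, not taken from the patent.

```python
import numpy as np

def prune_fc_pair(W1, W2, act_n, act_n1):
    """Delete neurons whose activation value is 0, as in fig. 4A -> 4B.
    W1: weights into layer N's neurons (one row per neuron);
    W2: weights from layer N's neurons into layer (N+1)'s neurons."""
    keep_n = np.asarray(act_n) > 0    # layer N neurons to keep
    keep_n1 = np.asarray(act_n1) > 0  # layer N+1 neurons to keep
    W1p = W1[keep_n, :]               # drop rows of pruned layer-N neurons
    W2p = W2[keep_n1][:, keep_n]      # drop pruned outputs and pruned inputs
    return W1p, W2p

# Activations from the example: layer N = (0,1,0,0,1,1), layer N+1 = (1,0,1,1).
W1 = np.ones((6, 3))  # 6 layer-N neurons, 3 inputs each (toy sizes)
W2 = np.ones((4, 6))  # 4 layer-(N+1) neurons, fully connected to layer N
W1p, W2p = prune_fc_pair(W1, W2, [0, 1, 0, 0, 1, 1], [1, 0, 1, 1])
# W1p.shape == (3, 3): O_n1, O_n3, O_n4 removed
# W2p.shape == (3, 3): O_(n+1)2 removed
```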
And S305, training the pruning neural network model by the data center according to the N groups of training samples to obtain the trained neural network model.
In the invention, the data center updates the model parameters in the pruning neural network model by using the loss function in the training process of the pruning neural network model so as to obtain the trained neural network model. Wherein the loss function is used to indicate a loss of error between training data obtained by inputting the input data into the first neural network model and prediction data obtained by inputting the input data into the second neural network model.
The first neural network model is the initial neural network model with the preset classification algorithm removed. The second neural network model is the pruned neural network model with the preset classification algorithm removed. The preset classification algorithm is the algorithm or rule in the model that computes the output result, such as the image classification rule softmax.
In practical applications, the preset classification algorithm is usually deployed in the fully connected layer of the model. In that case, the training data may be the data output before the fully connected layer when the input data is fed into the initial neural network model, and the prediction data may be the data output before the fully connected layer when the input data is fed into the pruned neural network model.
For example, referring to the YOLO model shown in fig. 1, the initial neural network model is a first YOLO model, the pruned neural network model is a second YOLO model, and the first and second YOLO models each include 7 convolutional layers and 2 fully-connected layers, but the neurons included in each network layer may be different. Accordingly, the training data in this example may be the data output from the last convolutional layer (i.e., the seventh convolutional layer, or a network layer before the fully-connected layer) obtained by inputting the input data into the first YOLO model. The predicted data may be output data of the seventh convolutional layer (i.e., the last convolutional layer) obtained by inputting the input data into the second YOLO model.
Understandably, in order to ensure the accuracy of model training, the data center can train the pruned neural network model multiple times with the N groups of training samples to obtain a trained model with higher accuracy. During training, the data center corrects the model parameters using the preset loss function. The essence of the training process is that the data center repeatedly computes the value of the loss function and selects the model parameters corresponding to its minimum value as the parameters of the trained neural network model, thereby obtaining the trained neural network model. It will be appreciated that the specific expression of the loss function may vary between neural network models; an example is described below.
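A minimal sketch of such a loss, assuming a squared-error form between the two models' outputs before the fully connected layer (the training data and the prediction data). The squared-error form and the function name are assumptions for illustration; the patent does not fix the exact expression here.

```python
def feature_loss(training_data, prediction_data):
    """Error loss between the training data (full model's output before
    the fully connected layer) and the prediction data (pruned model's
    output at the same point), as a sum of squared differences."""
    assert len(training_data) == len(prediction_data)
    return sum((t - p) ** 2 for t, p in zip(training_data, prediction_data))

# Toy pre-fully-connected-layer feature vectors from the two models:
loss = feature_loss([1.0, 2.0, 3.0], [1.0, 2.5, 3.0])
# loss == 0.25
```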
Step S306, the data center updates the initial neural network model into a trained neural network model.
And step S307, the edge device acquires the trained neural network model from the data center.
Step S308, the edge device acquires data to be processed, inputs the data to be processed into the trained neural network model, and acquires result data corresponding to the data to be processed.
The data center may store the trained neural network model. Optionally, the data center may replace the initial neural network model with the trained neural network model, that is, the initial neural network model is updated to the trained neural network model for downloading and use by the edge device.
Optionally, according to the above training principle, the data center may train, for each edge device, a neural network model suited to that device or the scene where it is deployed. After training such a model for each of at least one edge device, the data center may store the trained neural network model in association with the edge device (specifically, with an identifier of the edge device), so as to record for which edge device the trained model is personalized, or to which edge device it is suited.
Accordingly, the edge device can obtain the trained neural network model corresponding to the edge device from the data center according to actual requirements. Specifically, the edge device may send an acquisition request to the data center, where the acquisition request carries an identifier (e.g., a device name, a device ID number, etc.) of the edge device, and the acquisition request is used to request to acquire a trained neural network model adapted to the edge device. After receiving the acquisition request, the data center queries the trained neural network model corresponding to the identifier of the edge device according to the identifier of the edge device in the request, and sends the trained neural network model to the edge device.
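The store-by-identifier and query-by-identifier exchange described above can be sketched as follows. The registry class, method names, and the device identifier format are illustrative assumptions.

```python
class ModelRegistry:
    """Data-center side bookkeeping: store each trained model keyed by the
    edge device's identifier, and serve it back when that device sends an
    acquisition request carrying its identifier."""

    def __init__(self):
        self._models = {}

    def store(self, device_id, model):
        # Associate the trained model with the edge device's identifier.
        self._models[device_id] = model

    def handle_request(self, device_id):
        # Look up the personalized model by the identifier in the request.
        if device_id not in self._models:
            raise KeyError(f"no trained model stored for device {device_id}")
        return self._models[device_id]

registry = ModelRegistry()
registry.store("camera-01", {"layers": 7, "pruned": True})
model = registry.handle_request("camera-01")
# model["pruned"] is True; an unknown identifier raises KeyError
```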
Accordingly, the edge device side can receive the trained neural network model sent by the data center, and accordingly data processing can be conveniently carried out subsequently based on the trained neural network model. For example, the edge device may obtain data to be processed through its own device or other devices, and input the data to be processed into the trained neural network model for processing, so as to obtain corresponding result data, where the result data is used to indicate a result corresponding to the data to be processed. In different application scenarios, the data to be processed and the result data are different.
Illustratively, in the image classification scene, the data to be processed is an image to be classified, and the image is composed of at least one pixel point. Correspondingly, the edge device can input the image to be classified into the trained neural network model, and each neuron in the trained neural network model is used for calculating each pixel point in the image to be classified so as to obtain result data corresponding to the image to be classified. The result data is used to indicate the classification to which the image to be classified belongs, and may be, for example, a human image, a starry sky image, a beach image, a forest image, and the like.
As another example, in a speech recognition scenario, the data to be processed may be speech to be recognized. Correspondingly, the edge device can input the speech to be recognized into the trained neural network model, so that each neuron in the trained neural network model is used for calculating the speech to be recognized, and the result data corresponding to the speech to be recognized is obtained. The result data may be text information corresponding to the speech to be recognized. In other words, the trained neural network model can be used to realize the translation conversion from speech to text.
For the convenience of understanding the technical content of the present invention, the following description will be made in detail with reference to the YOLO model for vehicle detection.
Firstly, the data center can collect monitoring images of different traffic intersections and train an initial YOLO model with them. The edge device can obtain the initial YOLO model from the data center according to actual requirements, so that the model can subsequently be retrained for the edge device or the scene where it is deployed. Further, the edge device may capture vehicle images over a preset time period (for example, one month) in the current scene, and input the vehicle images into the initial YOLO model for processing, so as to identify characteristic information of the vehicles (for example, a vehicle identifier, a license plate number, or a vehicle contour). From this characteristic information the edge device obtains the target vehicle included in each vehicle image. At the same time, the activation value of each neuron in the initial YOLO model can be recorded. The edge device then sends the vehicle images, the target vehicles they include, and the activation value of each neuron in the initial YOLO model to the data center.
Accordingly, the data center deletes (prunes) the neurons with an activation value of 0 in the initial YOLO model according to the activation value of each neuron, so as to obtain a corresponding pruned neural network model. Further, the data center takes the vehicle image and the target vehicle included in the vehicle image as training samples, and trains the pruning neural network model again by using multiple groups of training samples. Specifically, the data center may adjust model parameters of the pruned neural network model using the loss function of the following formula (1) to obtain a trained YOLO model.
Where loss is the loss function; i indexes the pixel points of the vehicle image; t is the total number of pixel points in the vehicle image (in other words, the vehicle image contains t pixel points); P_i is the data output after pixel point i is processed by the pruned neural network model with the fully connected layer (specifically, the classification rule softmax deployed in the fully connected layer) removed; and F_i is the data output after pixel point i is processed by the initial YOLO model with the fully connected layer (specifically, the classification rule softmax deployed in the fully connected layer) removed.
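Formula (1) itself does not survive in this text. A reconstruction consistent with the term definitions above, under the assumption of a squared-error form summed over the t pixel points, would be:

```latex
\mathrm{loss} = \sum_{i=1}^{t} \left( F_i - P_i \right)^2 \tag{1}
```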
The data center trains and adjusts the model parameters in the pruned neural network model according to the loss function shown in formula (1). Optionally, to ensure model accuracy, the set of model parameters with the smallest loss value found during training is used as the parameters of the trained YOLO model, thereby obtaining the trained YOLO model; the invention is not limited or further detailed here. Optionally, the data center may store the identifier of the edge device in association with the trained YOLO model, so that the edge device can subsequently obtain the trained YOLO model from the data center according to actual requirements.
Accordingly, the edge device can obtain the trained YOLO model from the data center, so that the trained YOLO model can be conveniently used for vehicle detection subsequently. For example, assume that the edge device is a monitoring device deployed on a road, and the edge device may acquire and obtain a to-be-processed image including a to-be-detected vehicle. Correspondingly, the edge device takes the image to be processed as the input of the trained YOLO model, calculates each pixel point of the image to be processed by using the neuron of each network layer in the trained YOLO model, identifies the characteristic information (such as vehicle identification, license plate number, vehicle outline and the like) of the vehicle to be detected in the image to be processed, and further learns the vehicle to be detected, so that the vehicle detection can be realized conveniently and efficiently.
By implementing the embodiment of the invention, different trained neural network models can be trained for different edge devices or deployment scenes of the edge devices to form customized neural network models, and the edge devices can analyze and process data based on the customized neural network models, so that the precision and the processing efficiency can be improved. In addition, in the embodiment of the invention, the collected data does not need to be marked manually, and the neural network model of the data center analyzes and processes the collected data according to the sample data sent by each edge device to obtain the processed data, thereby further improving the processing efficiency of each edge device. Because each edge device can process data according to the customized neural network model, the processing result can be obtained in a short time, and the time consumption of data processing is reduced.
The foregoing describes in detail a related embodiment of the data processing method provided in the embodiment of the present invention with reference to fig. 1 to 3. The following describes a data processing apparatus, a device, and a system according to an embodiment of the present invention with reference to fig. 5 to 6.
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention. The data processing apparatus 500 shown in fig. 5, applied to the edge device side, may include an obtaining module 501 and a processing module 502; wherein,
the obtaining module 501 is configured to obtain an initial neural network model, where the initial neural network model includes at least one neuron, and the neuron is configured to process input data of the initial neural network model to obtain activation information of the neuron;
the processing module 502 is configured to input data to be processed into the trained neural network model to obtain result data;
the result data is obtained by processing the data to be processed by using the neurons in the trained neural network model, the trained neural network model is obtained by training a pruned neural network model by using N groups of training samples, the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to respective activation information of the at least one neuron, the activation information is respective information of the at least one neuron when the at least one neuron in the initial neural network model is used for data processing, and N is a positive integer.
In one possible embodiment, the training samples include input data and output data, wherein the output data is obtained by calculating the input data using the initial neural network model.
In one possible embodiment, the activation information comprises at least one of: an activation value, a number of activations, and an average activation value. The pruning of the at least one neuron in the initial neural network model according to its respective activation information includes:
1) when the activation information comprises an activation value, deleting the neurons in the initial neural network model whose activation values are less than or equal to a first threshold;
2) when the activation information comprises a number of activations, deleting the neurons whose numbers of activations are less than or equal to a second threshold;
3) when the activation information comprises an average activation value, deleting the neurons whose average activation values are less than or equal to a third threshold.
The activation value is the output value of a neuron in the initial neural network model each time it performs data processing; the number of activations is the number of times, over M rounds of data processing with the neurons of the initial neural network model, that a neuron's activation value is greater than or equal to a fourth threshold; the average activation value is the average of a neuron's activation values over the M rounds of data processing; and M is a positive integer less than or equal to N.
In a possible implementation, the trained neural network model is obtained by updating parameters in the pruned neural network model using a loss function. The loss function indicates the error loss between training data and prediction data, where the training data is the output, before the fully connected layer, obtained by inputting the input data of the training samples into the initial neural network model, and the prediction data is the output, before the fully connected layer, obtained by inputting the same input data into the pruned neural network model.
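A hedged sketch of this fine-tuning step: the pruned model's parameters are updated so that its output before the fully connected layer matches that of the initial model on the same inputs. A single linear layer stands in for each feature extractor, and mean-squared error is our assumed form of the loss; the text only says the loss indicates the error between the two pre-fully-connected-layer outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))            # inputs taken from the training samples
W_teacher = rng.standard_normal((8, 4))     # initial model's fixed feature weights
W_student = rng.standard_normal((8, 4))     # pruned model's trainable weights

def mse(a, b):
    # Assumed loss form: mean squared error between the two feature outputs.
    return float(((a - b) ** 2).mean())

init_loss = mse(x @ W_teacher, x @ W_student)
for _ in range(200):                        # plain gradient descent on the MSE
    diff = x @ W_student - x @ W_teacher    # prediction data minus training data
    W_student -= 0.05 * (2.0 * x.T @ diff / diff.size)
final_loss = mse(x @ W_teacher, x @ W_student)
```

Because the loss compares features rather than final labels, the pruned model is pulled toward the initial model's internal representation, which is one way to read the pre-fully-connected-layer formulation above.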
In a possible implementation, the obtaining module 501 is specifically configured to obtain an initial neural network model from a data center.
It should be understood that the apparatus 500 of the embodiment of the present invention may be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), where the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. When the data processing method shown in fig. 3 is implemented by software, the apparatus and its respective modules may also be software modules.
The data processing apparatus 500 provided in the embodiment of the present invention correspondingly executes the method provided in the embodiments of the present invention; the functions of each module and/or the other operations executed in the apparatus 500 implement the corresponding flow steps of the method in fig. 3, and are not described herein again for brevity.
By implementing the embodiment of the invention, different trained neural network models can be designed for different edge devices or for the deployment scenarios of the edge devices. This facilitates subsequent data processing using a neural network model suited to the edge device or its deployment scenario, and improves the accuracy of the data processing.
Fig. 6 is a schematic structural diagram of an edge device according to an embodiment of the present invention. The edge device 600 shown in fig. 6 may include one or more processors 601, communication interfaces 602, and memories 603. The processors 601, the communication interfaces 602, and the memories 603 may be connected by a bus, or may communicate by other means such as wireless transmission. The embodiment of the present invention takes connection through a bus 604 as an example, where the memory 603 is used for storing instructions and the processor 601 is used for executing the instructions stored in the memory 603. The memory 603 stores program code, and the processor 601 may call the program code stored in the memory 603 to perform the following operations:
acquiring an initial neural network model, wherein the initial neural network model comprises at least one neuron, and the neuron is used for processing input data of the initial neural network model to obtain activation information of the neuron;
inputting data to be processed into the trained neural network model to obtain result data;
wherein the result data is obtained by processing the data to be processed with the neurons in the trained neural network model; the trained neural network model is obtained by training a pruned neural network model with N groups of training samples; the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to the respective activation information of the at least one neuron; the activation information is the information of each of the at least one neuron when that neuron in the initial neural network model is used for data processing; and N is a positive integer.
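The flow described above can be sketched end to end under the same assumptions as before: a one-layer NumPy "model" with ReLU outputs stands in for the initial neural network model, the ReLU outputs serve as the activation information, and pruning keeps neurons whose average activation over the recorded runs exceeds a threshold. All names and thresholds are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
W_in = rng.standard_normal((6, 10))        # initial model: 10 hidden neurons
W_out = rng.standard_normal((10, 3))

def hidden(x, W):
    # Neuron outputs (the "activation values"), using ReLU.
    return np.maximum(x @ W, 0.0)

samples = rng.standard_normal((32, 6))     # inputs from the training samples
acts = hidden(samples, W_in)               # (32, 10) recorded activation values
keep = acts.mean(axis=0) > 0.2             # average-activation pruning criterion

# Pruned model: drop hidden neurons along with their fan-in and fan-out weights.
W_in_p, W_out_p = W_in[:, keep], W_out[keep, :]

x = rng.standard_normal((1, 6))            # data to be processed on the edge device
result = hidden(x, W_in_p) @ W_out_p       # result data from the pruned model
```

The fine-tuning on the N training-sample groups, described separately above, would update `W_in_p` and `W_out_p` after this pruning step before the model processes live data.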
Optionally, in this embodiment of the present invention, the processor 601 may call the program code stored in the memory 603 to perform all or part of the steps described in the embodiment of the method illustrated in fig. 3, and/or the other operations described herein, which are not repeated here.
It should be appreciated that the processor 601 may include one or more general-purpose processors, such as a central processing unit (CPU). The processor 601 may be used to run the related program code of the following functional modules, which may include, but are not limited to, the acquisition module and/or the processing module described above. That is, the processor 601 executes the functions of any one or more of these functional modules. For each functional module mentioned here, reference may be made to the relevant explanations in the foregoing embodiments, and details are not repeated.
The communication interface 602 may be a wired interface (e.g., an Ethernet interface) or a wireless interface (e.g., a cellular network interface or a wireless local area network interface) for communicating with other modules or devices. For example, in the embodiment of the present application, the communication interface 602 may be specifically configured to receive an initial neural network model or a trained neural network model sent by a data center.
The memory 603 may include a volatile memory, such as a random access memory (RAM); the memory may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 603 may also comprise a combination of the above kinds of memory. The memory 603 may be used to store a set of program code, so that the processor 601 can call the program code stored in the memory 603 to implement the functions of the above-mentioned functional modules involved in the embodiments of the present invention.
It should be understood that the edge device 600 according to the embodiment of the present invention may correspond to the data processing apparatus 500 shown in fig. 5, and may perform the operation steps of the edge-device-side execution body in the method shown in fig. 3. The above steps and the other operations and/or functions of each module in the edge device implement the corresponding flows of the methods in fig. 3, and are not described herein again for brevity.
It should be noted that fig. 6 is only one possible implementation of the embodiment of the present invention; in practical applications, the edge device may include more or fewer components, which is not limited herein. For content not shown or not described in the embodiment of the present invention, reference may be made to the related explanations in the embodiments described in fig. 1 to fig. 3, and details are not repeated here.
By implementing the embodiment of the invention, different trained neural network models can be designed for different edge devices or for the deployment scenarios of the edge devices. This facilitates subsequent data processing using a neural network model suited to the edge device or its deployment scenario, and improves the accuracy of the data processing.
Embodiments of the present invention further provide a data processing system, which includes the data center 102 and the edge device 104 shown in fig. 2. An initial neural network model or a trained neural network model is deployed in the data center 102. The edge device comprises a processor, a memory, a communication interface, and a bus; the processor, the communication interface, and the memory communicate with each other through the bus; the communication interface is used for receiving and transmitting data; the memory is used for storing instructions; and the processor is configured to call the instructions in the memory to perform all or part of the implementation steps described in the embodiment of the method illustrated in fig. 3, which are not described herein again.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium. The semiconductor medium may be a solid-state drive (SSD).
The foregoing is only illustrative of the present invention. Those skilled in the art can conceive of changes or substitutions based on the specific embodiments provided by the present invention, and all such changes or substitutions are intended to be included within the scope of the present invention.
Claims (12)
1. A method of data processing, the method comprising:
the method comprises the steps that an edge device obtains an initial neural network model, wherein the initial neural network model comprises at least one neuron, and the neuron is used for processing input data of the initial neural network model to obtain activation information of the neuron;
the edge device inputs data to be processed into the trained neural network model to obtain result data;
the result data is obtained by processing the data to be processed by using the neurons in the trained neural network model, the trained neural network model is obtained by training a pruned neural network model by using N groups of training samples, the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to respective activation information of the at least one neuron, the activation information is respective information of the at least one neuron when the at least one neuron in the initial neural network model is used for data processing, and N is a positive integer.
2. The method of claim 1, wherein the training samples comprise input data and output data, wherein the output data is computed from the input data using the initial neural network model.
3. The method according to claim 1 or 2, characterized in that the activation information comprises at least one of the following: an activation value, a number of activations, and an average activation value;
the pruning of the at least one neuron in the initial neural network model according to the respective activation information of the at least one neuron includes:
when the activation information comprises an activation value, deleting the neurons corresponding to the activation values smaller than or equal to a first threshold value in the initial neural network model according to the respective activation values of the at least one neuron;
when the activation information comprises activation times, deleting the neurons corresponding to the activation times smaller than or equal to a second threshold value in the initial neural network model according to the respective activation times of the at least one neuron;
when the activation information comprises an average activation value, deleting the neurons corresponding to the average activation value smaller than or equal to a third threshold value in the initial neural network model according to the respective average activation value of the at least one neuron;
the activation value is an output value of a neuron in the initial neural network model when the neuron performs data processing each time, the activation times are times when the activation value of the neuron in the initial neural network model is greater than or equal to a fourth threshold in the data processing performed by using the neuron in the initial neural network model for M times, the average activation value is an average value of the activation values of the neuron in the data processing performed by using the neuron in the initial neural network model for M times, and M is a positive integer less than or equal to N.
4. The method according to any one of claims 1 to 3, wherein the trained neural network model is obtained by training a pruning neural network model using N sets of training samples, and comprises:
the trained neural network model is obtained by updating parameters in the pruning neural network model by using a loss function based on N groups of training samples;
the loss function is used for indicating error loss between training data and prediction data, the training data is data output before a full-connection layer obtained by inputting input data in the training sample into the initial neural network model, and the prediction data is data output before the full-connection layer obtained by inputting input data in the training sample into the pruning neural network model.
5. The method according to any one of claims 1 to 4, wherein the edge device obtaining the initial neural network model comprises:
the edge device obtains an initial neural network model from the data center.
6. An edge device, characterized in that the edge device comprises an acquisition module and a processing module; wherein,
the obtaining module is configured to obtain an initial neural network model, where the initial neural network model includes at least one neuron, and the neuron is configured to process input data of the initial neural network model to obtain activation information of the neuron;
the processing module is used for inputting data to be processed into the trained neural network model to obtain result data;
the result data is obtained by processing the data to be processed by using the neurons in the trained neural network model, the trained neural network model is obtained by training a pruned neural network model by using N groups of training samples, the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to respective activation information of the at least one neuron, the activation information is respective information of the at least one neuron when the at least one neuron in the initial neural network model is used for data processing, and N is a positive integer.
7. The edge device of claim 6, wherein the training samples comprise input data and output data, wherein the output data is computed from the input data using the initial neural network model.
8. The edge device of claim 6 or 7, wherein the activation information comprises at least one of: an activation value, a number of activations, and an average activation value;
the pruning of the at least one neuron in the initial neural network model according to the respective activation information of the at least one neuron includes:
when the activation information comprises an activation value, deleting the neurons corresponding to the activation values smaller than or equal to a first threshold value in the initial neural network model according to the respective activation values of the at least one neuron;
when the activation information comprises activation times, deleting the neurons corresponding to the activation times smaller than or equal to a second threshold value in the initial neural network model according to the respective activation times of the at least one neuron;
when the activation information comprises an average activation value, deleting the neurons corresponding to the average activation value smaller than or equal to a third threshold value in the initial neural network model according to the respective average activation value of the at least one neuron;
the activation value is an output value of a neuron in the initial neural network model when the neuron performs data processing each time, the activation times are times when the activation value of the neuron in the initial neural network model is greater than or equal to a fourth threshold in the data processing performed by using the neuron in the initial neural network model for M times, the average activation value is an average value of the activation values of the neuron in the data processing performed by using the neuron in the initial neural network model for M times, and M is a positive integer less than or equal to N.
9. The edge device of any of claims 6-8, wherein the trained neural network model is further obtained by updating parameters in the pruned neural network model using a loss function;
the loss function is used for indicating error loss between training data and prediction data, the training data is data output before a full-connection layer obtained by inputting input data in the training sample into the initial neural network model, and the prediction data is data output before the full-connection layer obtained by inputting input data in the training sample into the pruning neural network model.
10. The edge device according to any of claims 6 to 9,
the obtaining module is specifically configured to obtain an initial neural network model from a data center.
11. An edge device comprising a memory and a processor coupled to the memory; the memory is configured to store instructions, and the processor is configured to execute the instructions; wherein the processor, when executing the instructions, performs the operational steps of the method of any one of the preceding claims 1-5.
12. A data processing system, characterized by comprising a data center and an edge device, wherein the data center is used for storing an initial neural network model and training the initial neural network model to obtain a trained neural network model; and the edge device is the edge device according to any one of claims 6 to 9, or the edge device according to claim 11.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811016411.4A CN110874550A (en) | 2018-08-31 | 2018-08-31 | Data processing method, device, equipment and system |
PCT/CN2019/085468 WO2020042658A1 (en) | 2018-08-31 | 2019-05-05 | Data processing method, device, apparatus, and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811016411.4A CN110874550A (en) | 2018-08-31 | 2018-08-31 | Data processing method, device, equipment and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110874550A true CN110874550A (en) | 2020-03-10 |
Family
ID=69642635
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811016411.4A Pending CN110874550A (en) | 2018-08-31 | 2018-08-31 | Data processing method, device, equipment and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110874550A (en) |
WO (1) | WO2020042658A1 (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111523640A (en) * | 2020-04-09 | 2020-08-11 | 北京百度网讯科技有限公司 | Training method and device of neural network model |
CN112085281A (en) * | 2020-09-11 | 2020-12-15 | 支付宝(杭州)信息技术有限公司 | Method and device for detecting safety of business prediction model |
CN113592059A (en) * | 2020-04-30 | 2021-11-02 | 伊姆西Ip控股有限责任公司 | Method, apparatus and computer program product for processing data |
WO2021218095A1 (en) * | 2020-04-30 | 2021-11-04 | 深圳市商汤科技有限公司 | Image processing method and apparatus, and electronic device and storage medium |
CN114422380A (en) * | 2020-10-09 | 2022-04-29 | 维沃移动通信有限公司 | Neural network information transmission method, device, communication equipment and storage medium |
WO2022126902A1 (en) * | 2020-12-18 | 2022-06-23 | 平安科技(深圳)有限公司 | Model compression method and apparatus, electronic device, and medium |
CN114692816A (en) * | 2020-12-31 | 2022-07-01 | 华为技术有限公司 | Processing method and equipment of neural network model |
CN114925821A (en) * | 2022-01-05 | 2022-08-19 | 华为技术有限公司 | Compression method of neural network model and related system |
WO2023279975A1 (en) * | 2021-07-06 | 2023-01-12 | 华为技术有限公司 | Model processing method, federated learning method, and related device |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7113958B2 (en) * | 2019-03-11 | 2022-08-05 | 三菱電機株式会社 | Driving support device and driving support method |
CN111522657B (en) * | 2020-04-14 | 2022-07-22 | 北京航空航天大学 | Distributed equipment collaborative deep learning reasoning method |
CN111783997B (en) * | 2020-06-29 | 2024-04-23 | 杭州海康威视数字技术股份有限公司 | Data processing method, device and equipment |
CN111967591B (en) * | 2020-06-29 | 2024-07-02 | 上饶市纯白数字科技有限公司 | Automatic pruning method and device for neural network and electronic equipment |
CN113935390A (en) * | 2020-06-29 | 2022-01-14 | 中兴通讯股份有限公司 | Data processing method, system, device and storage medium |
CN112001483A (en) * | 2020-08-14 | 2020-11-27 | 广州市百果园信息技术有限公司 | Method and device for pruning neural network model |
CN112784967B (en) * | 2021-01-29 | 2023-07-25 | 北京百度网讯科技有限公司 | Information processing method and device and electronic equipment |
CN112786028B (en) * | 2021-02-07 | 2024-03-26 | 百果园技术(新加坡)有限公司 | Acoustic model processing method, apparatus, device and readable storage medium |
CN113011581B (en) * | 2021-02-23 | 2023-04-07 | 北京三快在线科技有限公司 | Neural network model compression method and device, electronic equipment and readable storage medium |
CN116822635B (en) * | 2023-05-12 | 2024-09-24 | 中国科学院深圳先进技术研究院 | Track generation method, device, equipment and storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105512723A (en) * | 2016-01-20 | 2016-04-20 | 南京艾溪信息科技有限公司 | Artificial neural network calculating device and method for sparse connection |
CN105640577A (en) * | 2015-12-16 | 2016-06-08 | 深圳市智影医疗科技有限公司 | Method and system automatically detecting local lesion in radiographic image |
US20170061281A1 (en) * | 2015-08-27 | 2017-03-02 | International Business Machines Corporation | Deep neural network training with native devices |
US20170286830A1 (en) * | 2016-04-04 | 2017-10-05 | Technion Research & Development Foundation Limited | Quantized neural network training and inference |
CN107239825A (en) * | 2016-08-22 | 2017-10-10 | 北京深鉴智能科技有限公司 | Consider the deep neural network compression method of load balancing |
CN107609598A (en) * | 2017-09-27 | 2018-01-19 | 武汉斗鱼网络科技有限公司 | Image authentication model training method, device and readable storage medium storing program for executing |
US20180114114A1 (en) * | 2016-10-21 | 2018-04-26 | Nvidia Corporation | Systems and methods for pruning neural networks for resource efficient inference |
CN108229533A (en) * | 2017-11-22 | 2018-06-29 | 深圳市商汤科技有限公司 | Image processing method, model pruning method, device and equipment |
CN108416440A (en) * | 2018-03-20 | 2018-08-17 | 上海未来伙伴机器人有限公司 | A kind of training method of neural network, object identification method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247989B (en) * | 2017-06-15 | 2020-11-24 | 北京图森智途科技有限公司 | Real-time computer vision processing method and device |
CN108229679A (en) * | 2017-11-23 | 2018-06-29 | 北京市商汤科技开发有限公司 | Convolutional neural networks de-redundancy method and device, electronic equipment and storage medium |
- 2018-08-31: CN application CN201811016411.4A filed (publication CN110874550A), status Pending
- 2019-05-05: PCT application PCT/CN2019/085468 filed (publication WO2020042658A1), status Application Filing
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111523640A (en) * | 2020-04-09 | 2020-08-11 | 北京百度网讯科技有限公司 | Training method and device of neural network model |
CN111523640B (en) * | 2020-04-09 | 2023-10-31 | 北京百度网讯科技有限公司 | Training method and device for neural network model |
US11888705B2 (en) | 2020-04-30 | 2024-01-30 | EMC IP Holding Company LLC | Method, device, and computer program product for processing data |
CN113592059A (en) * | 2020-04-30 | 2021-11-02 | 伊姆西Ip控股有限责任公司 | Method, apparatus and computer program product for processing data |
WO2021218095A1 (en) * | 2020-04-30 | 2021-11-04 | 深圳市商汤科技有限公司 | Image processing method and apparatus, and electronic device and storage medium |
CN112085281B (en) * | 2020-09-11 | 2023-03-10 | 支付宝(杭州)信息技术有限公司 | Method and device for detecting safety of business prediction model |
CN112085281A (en) * | 2020-09-11 | 2020-12-15 | 支付宝(杭州)信息技术有限公司 | Method and device for detecting safety of business prediction model |
CN114422380B (en) * | 2020-10-09 | 2023-06-09 | 维沃移动通信有限公司 | Neural network information transmission method, device, communication equipment and storage medium |
CN114422380A (en) * | 2020-10-09 | 2022-04-29 | 维沃移动通信有限公司 | Neural network information transmission method, device, communication equipment and storage medium |
WO2022126902A1 (en) * | 2020-12-18 | 2022-06-23 | 平安科技(深圳)有限公司 | Model compression method and apparatus, electronic device, and medium |
CN114692816A (en) * | 2020-12-31 | 2022-07-01 | 华为技术有限公司 | Processing method and equipment of neural network model |
CN114692816B (en) * | 2020-12-31 | 2023-08-25 | 华为技术有限公司 | Processing method and equipment of neural network model |
WO2023279975A1 (en) * | 2021-07-06 | 2023-01-12 | 华为技术有限公司 | Model processing method, federated learning method, and related device |
CN114925821A (en) * | 2022-01-05 | 2022-08-19 | 华为技术有限公司 | Compression method of neural network model and related system |
CN114925821B (en) * | 2022-01-05 | 2023-06-27 | 华为技术有限公司 | Compression method and related system of neural network model |
Also Published As
Publication number | Publication date |
---|---|
WO2020042658A1 (en) | 2020-03-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110874550A (en) | Data processing method, device, equipment and system | |
CN111008640A (en) | Image recognition model training and image recognition method, device, terminal and medium | |
KR20200052806A (en) | Operating method of deep learning based climate change prediction system | |
CN111967594A (en) | Neural network compression method, device, equipment and storage medium | |
EP3620982B1 (en) | Sample processing method and device | |
CN111523640A (en) | Training method and device of neural network model | |
CN112905997B (en) | Method, device and system for detecting poisoning attack facing deep learning model | |
CN110096979B (en) | Model construction method, crowd density estimation method, device, equipment and medium | |
CN113095370A (en) | Image recognition method and device, electronic equipment and storage medium | |
CN109446897B (en) | Scene recognition method and device based on image context information | |
CN110751191A (en) | Image classification method and system | |
CN115953643A (en) | Knowledge distillation-based model training method and device and electronic equipment | |
CN107392311A (en) | The method and apparatus of sequence cutting | |
US20210166065A1 (en) | Method and machine readable storage medium of classifying a near sun sky image | |
CN112597919A (en) | Real-time medicine box detection method based on YOLOv3 pruning network and embedded development board | |
CN117671597B (en) | Method for constructing mouse detection model and mouse detection method and device | |
CN113887330A (en) | Target detection system based on remote sensing image | |
CN111339952B (en) | Image classification method and device based on artificial intelligence and electronic equipment | |
CN117095460A (en) | Self-supervision group behavior recognition method and system based on long-short time relation predictive coding | |
CN117152528A (en) | Insulator state recognition method, insulator state recognition device, insulator state recognition apparatus, insulator state recognition program, and insulator state recognition program | |
CN112288702A (en) | Road image detection method based on Internet of vehicles | |
CN110490876B (en) | Image segmentation method based on lightweight neural network | |
CN116826734A (en) | Photovoltaic power generation power prediction method and device based on multi-input model | |
CN114219051B (en) | Image classification method, classification model training method and device and electronic equipment | |
CN116563809A (en) | Road disease identification method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20200310 |