CN110874550A - Data processing method, device, equipment and system - Google Patents

Data processing method, device, equipment and system

Publication number
CN110874550A
Authority
CN
China
Prior art keywords
neural network
network model
data
neuron
activation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811016411.4A
Other languages
Chinese (zh)
Inventor
贾贝
傅蓉蓉
高帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201811016411.4A (CN110874550A)
Priority to PCT/CN2019/085468 (WO2020042658A1)
Publication of CN110874550A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a data processing method, comprising the following steps: an edge device obtains an initial neural network model that includes at least one neuron; the edge device inputs data to be processed into a trained neural network model to obtain result data. The result data is obtained by processing the data to be processed using neurons in the trained neural network model; the trained neural network model is obtained by training a pruned neural network model using N groups of training samples; and the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to the respective activation information of the at least one neuron. The method thereby improves the accuracy of data processing, reduces the amount of computation, and avoids wasting computing resources.

Description

Data processing method, device, equipment and system
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method, apparatus, device, and system.
Background
With the development of deep learning technology, and especially the popularization of convolutional neural networks, deep learning is widely applied in fields such as image processing and face recognition. At present, in consideration of model generality, design cost, and the like, a single general-purpose neural network model is usually designed for data processing within a given application scenario.
In a video monitoring scene, however, this approach does not take the actual monitoring environment into account, such as the different installation heights and angles of the cameras. If the same neural network model is used to process the images collected by different cameras, the accuracy of image processing may suffer. In addition, because a general-purpose neural network model is designed with more neurons and more network layers for the sake of generality, computing resources are wasted in some specific scenes.
Disclosure of Invention
The application discloses a data processing method, a data processing device and a data processing system, which can provide a neural network model suited to an edge device or to the scene in which the edge device is currently located for the corresponding data processing, thereby improving the accuracy of data processing and avoiding the waste of computing resources.
In a first aspect, the present application discloses a data processing method, including: the edge device obtains the trained neural network model, and inputs the data to be processed into the trained neural network model to obtain result data of the data to be processed. The trained neural network model is obtained by training the pruned neural network model using N groups of training samples. The pruned neural network model is obtained by pruning the initial neural network model according to the respective activation information of each neuron in the initial neural network model.
Because model training is computationally expensive and the computing resources of the edge device are limited, the training of the neural network model is generally performed on the data center side. In other words, the data center may prune the neurons in the initial neural network model according to the respective activation information of each neuron in the initial neural network model to obtain a pruned neural network model. Further, the data center trains the pruned neural network model using the N groups of training samples to obtain a trained neural network model. The data center then sends the trained neural network model to the edge device, so that the edge device can process data based on the trained neural network model.
The activation information of the neuron refers to relevant information generated by the neuron when the neuron in the initial neural network model is used for data processing. The activation information includes, but is not limited to, an activation value, a number of activations, and an average activation value, as described in more detail below.
By implementing the process, the neural network model suitable for the edge device (or the current scene of the edge device) can be provided, and then the neural network model is used for data processing, so that the accuracy of data processing can be improved. In addition, compared with the general neural network model in the prior art, the model scale can be reduced, the data processing rate is improved, and the waste of computing resources is avoided.
In one possible embodiment, each set of training samples includes input data and output data. The output data is obtained by processing the input data using the initial neural network model. Specifically, the edge device obtains an initial neural network model from the data center. Then, the edge device collects a plurality of groups of input data from itself or from the scene in which it is currently located, and processes the plurality of groups of input data using the initial neural network model to obtain the corresponding output data. Further, each group of input data and its output data is taken as one set of training samples, so that a plurality of sets of training samples are obtained. The edge device sends the plurality of sets of training samples to the data center, so that the data center can retrain the initial neural network model using them to obtain a trained neural network model suited to the edge device (or the current scene of the edge device).
Optionally, each time data processing is performed using the initial neural network model, the activation information of each neuron in the initial neural network model may be recorded.
By implementing the above steps, the edge device can obtain a customized trained neural network model that is suited to the edge device or the scene in which the edge device is located, so that a more accurate result can be obtained. Moreover, the training samples do not need manual labeling, which reduces user operations and at the same time improves the training precision of the neural network model.
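The sample-collection step of this embodiment can be illustrated with a minimal Python sketch. It is not the implementation of the present application: the function names, the toy stand-in model, and the list-of-floats "image" are all hypothetical and are chosen only to show that each training sample pairs collected input data with the output of the initial model, without manual labeling.

```python
def build_training_samples(initial_model, input_groups):
    """Pair each collected input with the initial model's own output."""
    samples = []
    for input_data in input_groups:
        output_data = initial_model(input_data)  # no manual labeling involved
        samples.append((input_data, output_data))
    return samples


def toy_initial_model(pixels):
    # Hypothetical stand-in for the initial neural network model:
    # classify an "image" by its mean brightness.
    return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"


# Hypothetical input data collected by the edge device in its current scene.
collected = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.0]]
training_samples = build_training_samples(toy_initial_model, collected)
```

In such a sketch, `training_samples` is what the edge device would send to the data center for retraining.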
In one possible embodiment, the activation information comprises at least one of: an activation value, a number of activations, and an average activation value. The activation value is the output value of a neuron in the initial neural network model each time data processing is carried out using the neuron. The number of activations is the number of times, over multiple rounds of data processing, that the activation value of a neuron in the initial neural network model is greater than a preset threshold (the fourth threshold), consistent with the definition of the activation times given in the Detailed Description. The average activation value is the average of the activation values of a neuron in the initial neural network model over multiple rounds of data processing. Correspondingly, the process of pruning the initial neural network model by the data center specifically includes at least one of the following. When the activation information includes an activation value, the data center deletes (prunes) the neurons whose activation value is less than or equal to a first threshold in the initial neural network model, according to the respective activation value of at least one neuron in the initial neural network model. When the activation information includes the number of activations, the data center deletes the neurons whose number of activations is less than or equal to a second threshold in the initial neural network model, according to the respective number of activations of at least one neuron in the initial neural network model. When the activation information includes the average activation value, the data center deletes the neurons whose average activation value is less than or equal to a third threshold in the initial neural network model, according to the respective average activation value of at least one neuron in the initial neural network model.
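The three pruning criteria can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the present application: the dictionary layout, the function name, and the threshold values are hypothetical, and combining several criteria by taking the union of the selected neurons is one possible reading of "at least one of the following".

```python
def select_neurons_to_prune(activation_info, first_threshold=None,
                            second_threshold=None, third_threshold=None):
    """Return the ids of neurons selected for deletion.

    A criterion is applied only when its threshold is given; a neuron is
    selected if any applied criterion matches (assumed union semantics).
    """
    to_prune = set()
    for neuron_id, info in activation_info.items():
        if first_threshold is not None and info["activation_value"] <= first_threshold:
            to_prune.add(neuron_id)
        if second_threshold is not None and info["activation_count"] <= second_threshold:
            to_prune.add(neuron_id)
        if third_threshold is not None and info["average_activation"] <= third_threshold:
            to_prune.add(neuron_id)
    return to_prune


# Hypothetical activation information recorded for three neurons.
activation_info = {
    "n1": {"activation_value": 0.9, "activation_count": 8, "average_activation": 0.7},
    "n2": {"activation_value": 0.0, "activation_count": 1, "average_activation": 0.05},
    "n3": {"activation_value": 0.4, "activation_count": 5, "average_activation": 0.3},
}
pruned = select_neurons_to_prune(activation_info, second_threshold=2, third_threshold=0.1)
kept = sorted(set(activation_info) - pruned)
```

With these hypothetical values, only the rarely and weakly activated neuron is selected for deletion, and the remaining neurons form the pruned model.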
By implementing the above process, the data center can prune (delete) neurons in the initial neural network model according to the activation information of each neuron in that model, so as to obtain the pruned neural network model. This facilitates training the pruned neural network model to obtain a trained neural network model suited to the edge device or the scene in which it is located. Therefore, the model scale can be reduced, the waste of computing resources is reduced, and the data processing efficiency is improved.
In a possible implementation manner, the trained neural network model may be obtained by the data center updating the parameters in the pruned neural network model using a loss function. The loss function indicates the error loss between training data and prediction data. The training data is the data output just before the fully-connected layer when the input data is fed into the initial neural network model. The prediction data is the data output just before the fully-connected layer when the input data is fed into the pruned neural network model. Specifically, the data center may use the N groups of training samples to train the pruned neural network model multiple times, using the loss function to correct the parameters of the model in each training pass, and select the parameters with the minimum loss function value as the parameters of the trained neural network model, thereby obtaining the trained neural network model.
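As an illustration of such a loss, the following Python sketch compares the features output before the fully-connected layer by the two models. The text does not specify the form of the loss function; mean squared error is assumed here purely for illustration, and the function name and feature vectors are hypothetical.

```python
def feature_loss(training_features, predicted_features):
    """Mean squared error between two equally long feature vectors.

    training_features: output before the fully-connected layer of the
        initial model (the "training data" in the text).
    predicted_features: output before the fully-connected layer of the
        pruned model (the "prediction data" in the text).
    MSE is an assumed choice; the source does not name the loss.
    """
    if len(training_features) != len(predicted_features):
        raise ValueError("feature vectors must have the same length")
    squared_errors = [(t - p) ** 2
                      for t, p in zip(training_features, predicted_features)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical feature vectors for one input.
loss = feature_loss([1.0, 2.0], [1.0, 4.0])
```

A loss value of zero would mean that the pruned model reproduces the features of the initial model exactly for that input.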
By implementing the process, the data center can obtain a trained neural network model with higher accuracy through training with the loss function, so that the edge device can subsequently use the trained neural network model directly to process data, which improves the accuracy of data processing.
In a second aspect, the present application provides a data processing apparatus comprising functional modules or units for performing the method as described in the first aspect above or in any possible implementation of the first aspect.
In a third aspect, the present application provides an edge device (e.g., a smart camera, a roadside monitoring device, etc.) comprising a processor, a memory, a communication interface, and a bus; the processor, the communication interface and the memory are communicated with each other through a bus; a communication interface for receiving and transmitting data; a memory to store instructions; a processor for invoking instructions in a memory for performing the method described in the first aspect or any possible implementation manner of the first aspect.
In a fourth aspect, the present application provides a data processing system comprising a data center and edge devices; the data center is used for storing an initial neural network model and training the initial neural network model by using N groups of training samples to obtain a trained neural network model. The edge device comprises a processor, a memory, a communication interface and a bus; the processor, the communication interface and the memory are communicated with each other through a bus; a communication interface for receiving and transmitting data; a memory to store instructions; a processor for invoking instructions in a memory for performing the method described in the first aspect or any possible implementation manner of the first aspect.
In a fifth aspect, the present application provides a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to perform the method of the first aspect described above.
In a sixth aspect, the present application provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of the above aspects.
On the basis of the implementations provided in the above aspects, the present application may further combine them to provide more implementations.
Drawings
Fig. 1 is a schematic structural diagram of a YOLO model according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a network framework of a data processing system according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating a data processing method according to an embodiment of the present invention.
Fig. 4A-4B are schematic structural diagrams of two network layers according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Fig. 6 is a schematic structural diagram of an edge device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail below with reference to the accompanying drawings of the present invention.
First, some technical terms related to the present invention are introduced.
The edge device refers to a device installed on the edge network side. For example, in a video monitoring scene, the edge device may be specifically a monitoring device or a smart camera installed on a road.
A neural network model refers to a complex network system formed by a large number of simple, interconnected processing units (called neurons). The neural network model is described on the basis of a mathematical model of a neuron, and has capabilities such as large-scale parallelism, distributed storage and processing, self-adaptation, and self-learning. See below for a description of neurons. In practical applications, the neural network model is composed of at least one of the following network layers: convolutional layers, activation layers, pooling layers, fully-connected layers, and the like. The number of each kind of network layer deployed is not limited in the present invention, and may be one or more. Each network layer is composed of one or more neurons, the parameters (weights) of which may also be referred to as model parameters of the neural network model. The neurons within one network layer may be interconnected, or the neurons in different network layers may be interconnected, as shown in detail in fig. 4A below.
A neuron is also called a node. Each neuron represents a particular output function, also called the excitation function. When a neural network model is used for data processing, the processing is in essence performed by the neurons in the neural network model, that is, by the excitation functions of the neurons. It is understood that, because neural network models differ, the excitation functions of the same neuron in different neural network models may also differ, and the present invention is not limited thereto.
A convolutional layer refers to a network layer for feature extraction. In the convolutional layer, a convolution operation may be performed on input data to extract deep-level feature data of the input data. Taking an image as the input data as an example, the image is input into a convolutional layer, and the convolutional layer performs a convolution operation on the image to obtain the deep feature image hidden in the image.
A pooling layer refers to a network layer for data compression. An activation layer performs an activation operation on input data, which is essentially the operation of a specific function, so as to enhance the nonlinear expression capability of the data. A fully-connected layer is a network layer that plays the role of a classifier in the neural network model; each neuron (node) in the fully-connected layer is connected with the neurons in the previous network layer so as to integrate the feature data output by those neurons and obtain the final output result of the neural network model.
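The roles of the convolutional, activation, and pooling layers can be illustrated with a minimal one-dimensional Python sketch. It is only an illustration under simplifying assumptions: real models operate on multi-channel images, the ReLU function is assumed as the activation function, max pooling is assumed as the pooling operation, and all names and values are hypothetical.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal (no padding, stride 1).

    As in most neural network frameworks, the kernel is not flipped,
    so this is technically a cross-correlation.
    """
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Activation layer: assumed ReLU, which zeroes negative values."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Pooling layer: keep the maximum of each non-overlapping window."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


signal = [1.0, 3.0, 2.0, 5.0, 4.0]        # hypothetical input data
features = conv1d(signal, [1.0, -1.0])     # feature extraction
activated = relu(features)                 # nonlinearity
compressed = max_pool(activated, size=2)   # data compression
```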
In the present invention, the neural network model includes, but is not limited to, a convolutional neural network model, a recurrent neural network model, a deep neural network model, a feedforward neural network model, a deep belief network model, a generative adversarial network model, and other neural network models. For convenience of description, the following embodiments of the present invention are described by taking the YOLO model, which is a convolutional neural network model, as an example. The YOLO model is a convolutional neural network model for target detection, for example, detecting a target object included in an image, such as a vehicle or a puppy. The relevant embodiments of the YOLO model are briefly described below.
Fig. 1 is a schematic diagram of a logic structure of a neural network model according to an embodiment of the present invention, where the neural network model may also be referred to as a YOLO model. As shown, the YOLO model includes 9 network layers, wherein the first 7 network layers (shown as layer 1 to layer 7) are convolutional layers, and the last 2 network layers (shown as layer 8 and layer 9) are fully-connected layers. The number of neurons included in each network layer may be different.
The activation information refers to information related to the activation of a neuron, such as the activation value, the activation times, and the average activation value of the neuron. A neuron is considered to be activated when it is used to perform a data operation (data calculation) in the neural network model.
The activation value refers to the output value of a neuron when the neuron is activated. In other words, it refers to the output value obtained when the neuron is used to perform a data operation on input data.
The activation times refer to the number of times that the output value of the neuron is greater than a preset threshold when the neuron is activated multiple times. In other words, when the neuron is used for data calculation multiple times, it is the count of the times at which the activation value (output value) of the neuron is greater than the preset threshold. The preset threshold is custom-set by a user or by the system. Specifically, the preset threshold may be obtained by a user or the system from statistics over a series of experimental data, or may be an empirical value set according to practical experience, and the like, which is not limited in the present invention.
The average activation value refers to the average of the output values of a neuron when the neuron is activated multiple times. In other words, it is the average of the activation values obtained when the neuron is used for data calculation multiple times.
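The activation times and the average activation value defined above can be computed from recorded activation values as in the following Python sketch. The data layout (a dictionary mapping a neuron identifier to the list of its recorded activation values), the function name, and the numbers are hypothetical and serve only as an illustration.

```python
def summarize_activations(history, preset_threshold):
    """Compute activation times and average activation value per neuron.

    history: neuron id -> list of activation values, one per use.
    preset_threshold: an activation is counted when the value is
        strictly greater than this threshold.
    """
    summary = {}
    for neuron_id, values in history.items():
        count = sum(1 for v in values if v > preset_threshold)
        average = sum(values) / len(values)
        summary[neuron_id] = {
            "activation_count": count,
            "average_activation": average,
        }
    return summary


# Hypothetical activation values recorded over three rounds of processing.
history = {
    "n1": [0.9, 0.8, 0.7],
    "n2": [0.0, 0.1, 0.0],
    "n3": [0.6, 0.0, 0.3],
}
summary = summarize_activations(history, preset_threshold=0.5)
```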
Illustratively, in an application scene of image recognition, the edge device is an intelligent camera. The edge device can periodically acquire a plurality of images in the current scene, and each image is input into the neural network model to be processed so as to obtain corresponding result data. The result data is used to indicate the classification to which the image belongs, e.g. forest, beach, starry sky. Wherein, each time an image is processed by using the neural network model, the image is essentially processed by using the neurons in the neural network model. Accordingly, in multi-image processing, the edge device will process the multiple images using the neurons in the neural network model multiple times. In each image processing process, the edge device records activation information of the neuron, such as an activation value, activation times, an average activation value, and the like, so as to facilitate subsequent updating of the neural network model by using the activation information of the neuron, which is described in detail below.
A loss function, also referred to as a cost function, is an optimization function of a neural network model that measures the degree of prediction error. The essence of training a neural network model is to find the minimum of the loss function, that is, the point at which the difference between the predicted result and the actual result (also called the loss value) is smallest, or at which the predicted result and the actual result are closest. Understandably, in the actual training process, the data center trains the model parameters of the neural network model multiple times, takes the model parameters from the training pass with the minimum loss function value, and uses them as the model parameters of the trained neural network model, thereby obtaining the trained neural network model.
For example, taking a neural network model as an image recognition model as an example, assume that a training sample for training the image recognition model is a plurality of images, each image is composed of one or more pixel points, and each pixel point corresponds to a respective pixel value (i.e., input real data). Each image comprises a target object, and the pixel value of the image area where the target object is located is different from the pixel values of other areas in the image. In other words, the target object may be distinguished or identified by the pixel values of the image area in which the target object is located.
Correspondingly, the data center inputs the plurality of images into the image recognition model and performs inference on the pixel points in each image to obtain prediction data for the pixel points. Furthermore, the data center compares the prediction data of the pixel points in the plurality of images with the real data of the pixel points using a preset loss function, so as to correct the model parameters in the image recognition model. By repeating the above operations, the data center can find the training pass in which the difference between the prediction data and the real data of the pixel points in the plurality of images is smallest, and uses the model parameters of the image recognition model corrected in that pass as the model parameters of the trained image recognition model, thereby obtaining the trained image recognition model.
Fig. 2 is a schematic diagram of a network framework of a data processing system according to an embodiment of the present invention. As shown, the data processing system 100 includes a data center 102 and edge devices 104. The data center 102, also referred to as a cloud, includes a plurality of servers. A neural network model (specifically, an initial neural network model or a trained neural network model) is deployed in the data center 102 for the edge device 104 to download and use.
In practical applications, because the training process imposes high energy consumption demands and the computing power of the edge device is low, the training of the neural network model is usually completed in the data center 102.
The training process of the neural network model is briefly described below. Specifically, the data center obtains an initial neural network model (a neural network model to be trained or optimized) and a training sample set. Further, the initial neural network model is trained multiple times using the training sample set. Among the multiple training passes, the pass with the minimum loss function value is selected, and the model parameters of the neural network model generated by that pass are used as the model parameters of the trained neural network model, thereby obtaining the trained neural network model. The training of the neural network model is described in detail below.
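The idea of training multiple times and keeping the parameters with the minimum loss function value can be sketched in Python as follows. The sketch is deliberately tiny and makes several assumptions not found in the text: the "model" has a single weight, the loss is mean squared error, the parameters are corrected by gradient descent, and the learning rate, number of passes, and samples are hypothetical.

```python
def train_and_keep_best(samples, rounds=50, learning_rate=0.05):
    """Train y = w * x and return the weight with the lowest observed loss."""
    w = 0.0
    best_w, best_loss = w, float("inf")
    for _ in range(rounds):
        # Loss of the current parameters over all training samples (assumed MSE).
        loss = sum((w * x - y) ** 2 for x, y in samples) / len(samples)
        if loss < best_loss:
            best_w, best_loss = w, loss  # remember the best pass so far
        # Correct the parameter using the gradient of the loss.
        gradient = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= learning_rate * gradient
    return best_w, best_loss


# Hypothetical training samples of (input data, output data) with y = 2x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
best_w, best_loss = train_and_keep_best(samples)
```

Keeping the best parameters separately from the current ones matters when the loss does not decrease monotonically, for example with a larger learning rate or noisy samples.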
Wherein, the initial neural network model can be determined according to the actual application scene. Illustratively, taking the image recognition scenario as an example, the initial neural network model may be a convolutional neural network model, such as the YOLO model above. Taking a speech recognition scenario as an example, the initial neural network model may be a deep neural network model, which may include one or more network layers, and neurons in two adjacent network layers may be connected to each other, and the invention is not limited thereto.
The training sample set comprises N groups of training samples, and each group of training samples is used for training the initial neural network model to obtain the trained neural network model. Each training sample group comprises input data and output data, and the input data and the output data correspond to each other one by one. N is a positive integer. In different application scenarios, the training samples (i.e., input data and output data) may not be identical.
For example, in an image recognition scenario, assume that an edge device wants to use a neural network model to recognize, from an image of the sun, the time period of the day in which the image was taken. Correspondingly, when the training sample set is selected, the input data includes sun images taken at different time periods in a day, and the output data includes the position of the sun in each sun image or other feature information identifying the time period of the sun in the sun image, so that the trained neural network model can accurately classify the time period of the sun in an image.
Accordingly, after the data center obtains the training sample set (i.e., a plurality of sets of training samples including input data and output data), the initial neural network model may be trained using the plurality of sun images in the training sample set and the real time period of the sun in each sun image, so as to obtain a trained neural network model. Specifically, the data center may input each sun image into the initial neural network model and process the pixel points in each sun image using the neural network model, so as to predict the time period of the sun in the sun image according to the position of the sun in the sun image or other feature information identifying the time period of the sun. Further, the data center calculates the difference between the real time period and the predicted time period of the sun in the sun image using a preset loss function. By repeating this process multiple times, the data center can determine the model parameters of the neural network model from the pass with the smallest difference, and use them as the model parameters of the trained neural network model, thereby obtaining the trained neural network model.
In the present invention, the data center trains the neural network model using training samples that include input data and output data; specifically, the initial neural network model is trained in an unsupervised, intelligent training mode without human participation. Compared with the prior art, in which the neural network model is trained using manually labeled input data, this avoids the errors introduced by human participation, thereby improving the accuracy of the training, and, because no human participation is needed, it also makes the training more convenient.
For example, in an image recognition application scenario, the neural network model training process in the conventional technology usually selects labeled images as training samples, for example, images labeled with the objects they include, such as a puppy, a person, or a vehicle. The initial neural network model is then trained directly using the labeled images.
In the present invention, however, the data center uses images without annotation as training samples for training the initial neural network model. Specifically, the training samples include input data (here, images) and output data (here, the objects included in the images). The output data is obtained by feeding the image as input data into the initial neural network model, which performs calculations on the pixels in the image and predicts the object included in the image according to the feature information of the object (such as contour features of the object, an identifier of the object, and the like). Training the neural network model with training samples that require no manual labeling therefore makes the training more convenient; at the same time, errors caused by manual labeling are avoided, and the accuracy of the neural network model can be improved.
The edge device 104 may obtain an initial neural network model or a trained neural network model from the data center 102, so as to perform corresponding data processing by using the obtained neural network model. In different application scenarios, the actual data processing performed by using the acquired neural network model is also different.
For example, in an image recognition scenario, the input data is an image that includes feature information of the vehicle to be recognized, such as an identifier of the vehicle or the contour of the vehicle. Accordingly, the edge device may input the image into the trained neural network model and identify the vehicle included in the image according to the feature information of the vehicle in the image. As another example, in a speech recognition scenario, the input data is speech to be recognized. Accordingly, the edge device may input the speech to be recognized into the trained neural network model, which processes the feature information included in the speech, such as frequency or wavelength, to obtain the text information corresponding to the speech. The present invention is not limited in this respect.
Optionally, the above training sample set (N training samples) may be collected by the edge device 104 and sent to the data center 102, so that the data center 102 can train the initial neural network model with the training sample set. Specifically, the edge device 104 may first obtain an initial neural network model from the data center 102. The edge device can then acquire N groups of input data, either through the device itself or through other edge devices, and input each group of input data into the initial neural network model to obtain the corresponding output data. In this way, the edge device obtains a training sample set including N groups of training samples, where each group of training samples includes input data and output data. Details are not repeated here.
It is understood that in different application scenarios the type of the input data or the output data may differ, and may include, but is not limited to, image data, voice data, text data, and the like. Specifically, when the edge device has data collection capability, it can collect the corresponding input data itself. Taking an image processing scenario as an example, if the edge device is provided with a camera, it can capture one or more images of the current scene through the camera to be used as input data. Conversely, when the edge device does not have data collection capability, it can obtain the corresponding input data through other edge devices that do. The edge device then obtains the corresponding training samples from the obtained input data; for details, refer to the related explanation in the foregoing embodiments, which is not repeated here.
In the present invention, the number of edge devices 104 is not limited and may be one or more; the figure takes n edge devices as an example, where n is a positive integer. The edge device may be, for example, a smart camera, a smartphone, a tablet computer, a palmtop computer, a notebook computer, or the like, and the present invention is not limited thereto.
The following illustrates a specific implementation process by which the edge device obtains the training sample set. Take an image classification scenario in which the edge device is a monitoring device. The edge device may capture multiple images of the current scene as input data for the neural network model. Further, the edge device may obtain an initial neural network model from the data center and use it to process the captured images, obtaining the classification to which each input image belongs, for example, a person image, a starry sky image, a sea view image, a beach image, and the like. The edge device may then treat each input image together with the class to which it belongs as one group of training samples, thereby obtaining a training sample set including a plurality of groups of training samples. Optionally, the edge device may send the training sample set to the data center, so that the data center can retrain the initial neural network model with it; details are not described here again.
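The collection of training samples described above can be sketched as follows. This is a minimal illustration only: `build_training_set` and `toy_initial_model` are hypothetical names, and the toy classifier (which labels an image by its mean brightness) merely stands in for the initial neural network model obtained from the data center.

```python
from typing import Callable, List, Tuple, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def build_training_set(model: Callable[[X], Y], inputs: List[X]) -> List[Tuple[X, Y]]:
    """Pair each collected input with the output the initial model predicts for it."""
    return [(x, model(x)) for x in inputs]


# Hypothetical stand-in for the initial model: an "image" is a list of pixel
# intensities in [0, 1], and the class is chosen from its mean brightness.
def toy_initial_model(image: List[float]) -> str:
    mean = sum(image) / len(image)
    return "starry sky image" if mean < 0.5 else "beach image"


images = [[0.1, 0.2, 0.1], [0.9, 0.8, 0.7]]
training_set = build_training_set(toy_initial_model, images)
```

Each element of `training_set` is one group of training samples: the input image and the class that the initial model assigned to it, with no manual labeling involved.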
Next, a related embodiment of a data processing method to which the present invention relates will be described. Fig. 3 is a schematic flow chart of a data processing method according to an embodiment of the present invention. The method as shown in fig. 3 comprises the following implementation steps:
Step S301, the edge device obtains an initial neural network model from the data center.
In the invention, the initial neural network model may also be referred to as a full-scale model. It may be a general model obtained in advance by the data center through training with a training sample set, and the general model is suitable for all or some application scenarios. One example is the YOLO model commonly used in the field of object detection.
Step S302, the edge device acquires N groups of input data, and inputs the N groups of input data into the initial neural network model respectively to obtain output data of the N groups of input data and respective activation information of each neuron in the initial neural network model.
The edge device can collect the corresponding input data when performing data processing (inference) with the initial neural network model. In different application scenarios the input data may differ; for example, it may be image data, text data, voice data, and the like. For details, refer to the relevant descriptions in the foregoing embodiments, which are not repeated here. Further, the edge device may input the input data into the initial neural network model for calculation and obtain the corresponding output data. In this way, the edge device performs inference N times with the initial neural network model. That is, the edge device may collect N groups of input data and process them with the initial neural network model, one group at a time, to obtain the output data corresponding to each of the N groups of input data.
Optionally, the edge device takes the input data and the output data obtained in one inference as one group of training samples. After the initial neural network model has processed N groups of input data, the edge device has N groups of training samples, each including input data and output data, where the output data is obtained by processing the input data with the initial neural network model. Each time the edge device runs the initial neural network model once, it obtains one piece of output data.
Optionally, the edge device essentially computes a set of input data using neurons in the initial neural network model each time the input data is processed using the initial neural network model. In the calculation process, the edge device can also record activation information generated when each neuron in the initial neural network model performs calculation (is activated). In other words, the edge device may record the respective activation information of each neuron each time a data calculation is performed using each neuron in the initial neural network model. The activation information includes, but is not limited to, any one or combination of more of the following: an activation value, a number of activations, and an average activation value. For the description of the activation information, reference may be made to the foregoing embodiments, which are not described in detail herein.
Specifically, each time the edge device processes input data with the neurons in the initial neural network model, it may record the activation value of each neuron in the initial neural network model. When the edge device performs data processing with the neurons in the initial neural network model multiple times (e.g., M times), it may count and record information such as the number of activations and the average activation value of each neuron. The number of activations of a neuron is the number of its M activation values, recorded over the M rounds of data processing, that are greater than or equal to a preset threshold. M is a positive integer less than or equal to N. The average activation value of a neuron is the average of its M activation values over the M rounds of data processing.
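The two statistics defined above can be computed as in the following sketch. The function name `summarize_activations`, the nested-list layout of the recorded values, and the example numbers are assumptions made for illustration; `count_threshold` plays the role of the preset threshold used when counting activations.

```python
from typing import Dict, List


def summarize_activations(
    activations: List[List[float]], count_threshold: float
) -> List[Dict[str, float]]:
    """Compute per-neuron statistics from M recorded inferences.

    activations[m][j] is the activation value of neuron j on the m-th inference.
    """
    m_runs = len(activations)
    n_neurons = len(activations[0])
    stats = []
    for j in range(n_neurons):
        values = [activations[m][j] for m in range(m_runs)]
        stats.append(
            {
                # Number of the M activation values that reach the preset threshold.
                "activation_count": sum(1 for v in values if v >= count_threshold),
                # Mean of the M activation values.
                "average_activation": sum(values) / m_runs,
            }
        )
    return stats


# Three inferences (M = 3) over two neurons.
recorded = [[0.0, 0.8], [0.0, 0.4], [0.3, 0.6]]
stats = summarize_activations(recorded, count_threshold=0.5)
```

Here the first neuron never reaches the threshold, while the second reaches it on two of the three inferences.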
Step S303, the edge device sends the N sets of training samples and the activation information of each neuron in the initial neural network model to the data center.
Accordingly, the data center receives N sets of training samples and the respective activation information for each neuron. The training sample includes input data and output data, where the output data is data obtained by calculating the input data as an input of the initial neural network model, and for the input data and the output data, reference may be made to the relevant explanations in the foregoing embodiments, and details are not described here.
The edge device sends the N groups of training samples and the respective activation information of each neuron to the data center, so that the data center can conveniently retrain the initial neural network model by using the information to obtain the neural network model adaptive to the edge device (or the deployment scene of the edge device). Therefore, the neural network models belonging to different edge devices are trained conveniently, the actual requirements of different edge devices are met, and the practicability of model processing is improved. In other words, the invention can retrain the personalized neural network model according to the edge device or the deployment scene of the edge device, so as to meet the real-time requirement of the edge device.
Step S304, the data center prunes the neurons in the initial neural network model according to the respective activation information of each neuron to obtain a pruned neural network model.
In the invention, the data center deletes the neurons that meet any one or more of the following conditions, according to the respective activation information of each neuron in the initial neural network model, to obtain the corresponding pruned neural network model. The conditions are:
1) the activation value of the neuron is less than or equal to a first threshold;
2) the number of activations of neurons is less than or equal to a second threshold;
3) the average activation value of the neurons is less than or equal to a third threshold value.
Therefore, neurons which are not suitable for the edge device or the deployment scene of the edge device can be cut off conveniently, so that the model scale is reduced, the calculation amount is reduced, and the calculation resources are saved.
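The three deletion conditions can be expressed as a single predicate, sketched below. The function name `should_prune`, the keyword names, and the default threshold values are hypothetical; a statistic that was not recorded is passed as `None` and is simply not checked.

```python
from typing import Optional


def should_prune(
    activation_value: Optional[float] = None,
    activation_count: Optional[int] = None,
    average_activation: Optional[float] = None,
    first_threshold: float = 0.0,
    second_threshold: int = 0,
    third_threshold: float = 0.0,
) -> bool:
    """Return True if the neuron meets any one of the three deletion conditions."""
    checks = []
    if activation_value is not None:
        checks.append(activation_value <= first_threshold)  # condition 1
    if activation_count is not None:
        checks.append(activation_count <= second_threshold)  # condition 2
    if average_activation is not None:
        checks.append(average_activation <= third_threshold)  # condition 3
    return any(checks)


never_fired = should_prune(activation_value=0.0)
active = should_prune(activation_value=1.0, activation_count=3, average_activation=0.5)
```

Because the conditions are combined with `any`, a neuron is deleted as soon as one recorded statistic falls at or below its threshold.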
The first threshold, the second threshold, and the third threshold may be set by a user or by the system, and they may be the same or different; the present invention is not limited thereto. For example, if the system wants to obtain a pruned neural network model with higher computational accuracy, the three thresholds can be set larger, for example, to 5. Conversely, if the system wants to obtain a pruned neural network model with lower computational accuracy, the three thresholds can be set smaller, for example, to 0.01. Optionally, the three thresholds may be derived by the system from statistical data, or may be empirical values set by a user based on practical experience, and the like, which is not limited in the present invention.
For example, Fig. 4A shows a schematic structural diagram of two network layers in the YOLO model. As shown in Fig. 4A, the Nth network layer (referred to as the Nth layer; specifically, any one of layers 1-8 in the YOLO model in Fig. 1) includes 6 neurons: O_n1, O_n2, ..., O_n6. The (N+1)th layer includes 4 neurons: O_(n+1)1, O_(n+1)2, ..., O_(n+1)4. The two adjacent layers are fully connected, that is, each neuron of the Nth layer is connected to all neurons of the (N+1)th layer. In the pruning process, assume that the activation values of the 6 neurons in the Nth layer are 0, 1, 0, 0, 1, and 1, respectively, and that the activation values of the 4 neurons in the (N+1)th layer are 1, 0, 1, and 1, respectively. The data center then deletes the neurons whose activation values are less than or equal to 0. That is, the data center deletes neurons O_n1, O_n3, and O_n4 in the Nth layer and neuron O_(n+1)2 in the (N+1)th layer, thereby obtaining the pruned neural network model with two network layers shown in Fig. 4B.
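The Fig. 4A to Fig. 4B example can be reproduced with a small sketch that removes the deleted neurons from the connection weights between the two layers. The helper `prune_fully_connected` and the placeholder weight values are assumptions for illustration; only the activation values and the threshold of 0 come from the example above.

```python
from typing import List


def prune_fully_connected(
    weights: List[List[float]], keep_in: List[int], keep_out: List[int]
) -> List[List[float]]:
    """Keep only the connections between surviving neurons.

    weights[i][j] connects neuron i of layer N to neuron j of layer N+1.
    """
    return [[weights[i][j] for j in keep_out] for i in keep_in]


layer_n_activations = [0, 1, 0, 0, 1, 1]   # O_n1 .. O_n6
layer_n1_activations = [1, 0, 1, 1]        # O_(n+1)1 .. O_(n+1)4
threshold = 0

# A neuron survives only if its activation value is above the threshold.
keep_n = [i for i, a in enumerate(layer_n_activations) if a > threshold]
keep_n1 = [j for j, a in enumerate(layer_n1_activations) if a > threshold]

# Placeholder weights: the value 10*i + j just makes each connection identifiable.
weights = [[10 * i + j for j in range(4)] for i in range(6)]
pruned = prune_fully_connected(weights, keep_n, keep_n1)
```

With zero-based indices, `keep_n` is `[1, 4, 5]` (neurons O_n2, O_n5, O_n6) and `keep_n1` is `[0, 2, 3]`, so the 6 x 4 connection matrix shrinks to 3 x 3.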
Step S305, the data center trains the pruned neural network model according to the N groups of training samples to obtain the trained neural network model.
In the invention, during training of the pruned neural network model, the data center updates the model parameters of the pruned neural network model by using a loss function to obtain the trained neural network model. The loss function indicates the error loss between training data, obtained by inputting the input data into a first neural network model, and prediction data, obtained by inputting the input data into a second neural network model.
The first neural network model is the initial neural network model with the preset classification algorithm removed. The second neural network model is the pruned neural network model with the preset classification algorithm removed. The preset classification algorithm is an algorithm or rule in the model for calculating the output result, such as the image classification rule softmax.
In practical applications, the preset classification algorithm is usually designed in the fully connected layer of the model. In that case, the training data may be the data output before the fully connected layer when the input data is input into the initial neural network model, and the prediction data may be the data output before the fully connected layer when the input data is input into the pruned neural network model.
For example, referring to the YOLO model shown in fig. 1, the initial neural network model is a first YOLO model, the pruned neural network model is a second YOLO model, and the first and second YOLO models each include 7 convolutional layers and 2 fully-connected layers, but the neurons included in each network layer may be different. Accordingly, the training data in this example may be the data output from the last convolutional layer (i.e., the seventh convolutional layer, or a network layer before the fully-connected layer) obtained by inputting the input data into the first YOLO model. The predicted data may be output data of the seventh convolutional layer (i.e., the last convolutional layer) obtained by inputting the input data into the second YOLO model.
Understandably, to ensure the accuracy of model training, the data center can train the pruned neural network model multiple times with the N groups of training samples to obtain a trained neural network model with higher accuracy. In the model training process, the data center corrects the model parameters by using a preset loss function to obtain the trained neural network model. In essence, the data center repeatedly calculates the value of the loss function and selects the model parameters for which the value of the loss function is smallest as the model parameters of the trained neural network model. It will be appreciated that the specific expression of the loss function may vary from one neural network model to another; the present invention describes one example below.
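The selection of the parameters with the smallest loss can be sketched as a training loop that remembers the best parameters seen so far. Everything in this sketch is an illustrative assumption: the one-parameter toy model, the plain gradient-descent update, the learning rate, and the mean-squared-error loss are not taken from the invention.

```python
from typing import Callable, List, Tuple


def train_keep_best(
    params: List[float],
    loss_fn: Callable[[List[float]], float],
    grad_fn: Callable[[List[float]], List[float]],
    lr: float = 0.1,
    steps: int = 50,
) -> Tuple[List[float], float]:
    """Run gradient descent and return the parameters with the smallest loss seen."""
    best_params, best_loss = list(params), loss_fn(params)
    for _ in range(steps):
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
        current = loss_fn(params)
        if current < best_loss:
            best_params, best_loss = list(params), current
    return best_params, best_loss


# Toy setup: the "pruned model" computes w * x, and the target outputs 2 * x
# stand in for the data produced by the initial model.
xs = [1.0, 2.0, 3.0]
targets = [2.0 * x for x in xs]


def loss_fn(p: List[float]) -> float:
    return sum((p[0] * x - t) ** 2 for x, t in zip(xs, targets)) / len(xs)


def grad_fn(p: List[float]) -> List[float]:
    return [sum(2 * (p[0] * x - t) * x for x, t in zip(xs, targets)) / len(xs)]


best_params, best_loss = train_keep_best([0.0], loss_fn, grad_fn)
```

In this toy case the loop recovers a weight close to 2, the value at which the pruned model reproduces the target outputs.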
Step S306, the data center updates the initial neural network model into a trained neural network model.
Step S307, the edge device acquires the trained neural network model from the data center.
Step S308, the edge device acquires data to be processed, inputs the data to be processed into the trained neural network model, and acquires result data corresponding to the data to be processed.
The data center may store the trained neural network model. Optionally, the data center may replace the initial neural network model with the trained neural network model, that is, the initial neural network model is updated to the trained neural network model for downloading and use by the edge device.
Optionally, following the training principle described above, the data center may train, for each edge device, a trained neural network model suited to that device or to the scene in which it is deployed. After the data center has trained such a model for each of at least one edge device, it may store each trained neural network model in association with the corresponding edge device (specifically, with an identifier of the edge device), so as to record which edge device the trained neural network model was customized for, or which edge device it is suitable for.
Accordingly, the edge device can obtain the trained neural network model corresponding to the edge device from the data center according to actual requirements. Specifically, the edge device may send an acquisition request to the data center, where the acquisition request carries an identifier (e.g., a device name, a device ID number, etc.) of the edge device, and the acquisition request is used to request to acquire a trained neural network model adapted to the edge device. After receiving the acquisition request, the data center queries the trained neural network model corresponding to the identifier of the edge device according to the identifier of the edge device in the request, and sends the trained neural network model to the edge device.
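The associated storage and the acquisition request can be sketched as a lookup keyed by the edge device identifier. The class `ModelRegistry`, the dictionary-shaped request with a `device_id` field, and the fallback to the initial model when no customized model exists are assumptions made for this illustration.

```python
from typing import Any, Dict


class ModelRegistry:
    """Data-center-side store that maps edge device identifiers to trained models."""

    def __init__(self, initial_model: Any) -> None:
        self._initial_model = initial_model
        self._by_device: Dict[str, Any] = {}

    def store(self, device_id: str, trained_model: Any) -> None:
        """Store a trained model in association with an edge device identifier."""
        self._by_device[device_id] = trained_model

    def handle_acquisition_request(self, request: Dict[str, str]) -> Any:
        """Return the model for the requesting device.

        Assumption: a device without a customized model receives the initial model.
        """
        return self._by_device.get(request["device_id"], self._initial_model)


registry = ModelRegistry(initial_model="initial-yolo")
registry.store("camera-001", "yolo-pruned-for-camera-001")

customized = registry.handle_acquisition_request({"device_id": "camera-001"})
fallback = registry.handle_acquisition_request({"device_id": "camera-999"})
```

Strings stand in for model objects here; in practice the stored value would be the serialized trained neural network model.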
Accordingly, the edge device side can receive the trained neural network model sent by the data center, and accordingly data processing can be conveniently carried out subsequently based on the trained neural network model. For example, the edge device may obtain data to be processed through its own device or other devices, and input the data to be processed into the trained neural network model for processing, so as to obtain corresponding result data, where the result data is used to indicate a result corresponding to the data to be processed. In different application scenarios, the data to be processed and the result data are different.
Illustratively, in the image classification scene, the data to be processed is an image to be classified, and the image is composed of at least one pixel point. Correspondingly, the edge device can input the image to be classified into the trained neural network model, and each neuron in the trained neural network model is used for calculating each pixel point in the image to be classified so as to obtain result data corresponding to the image to be classified. The result data is used to indicate the classification to which the image to be classified belongs, and may be, for example, a human image, a starry sky image, a beach image, a forest image, and the like.
As another example, in a speech recognition scenario, the data to be processed may be speech to be recognized. Correspondingly, the edge device can input the speech to be recognized into the trained neural network model, so that each neuron in the trained neural network model performs calculations on the speech and the result data corresponding to the speech is obtained. The result data may be the text information corresponding to the speech to be recognized. In other words, the trained neural network model can be used to convert speech to text.
For the convenience of understanding the technical content of the present invention, the following description will be made in detail with reference to the YOLO model for vehicle detection.
First, the data center can collect monitoring images from different traffic intersections and train an initial YOLO model with them. The edge device can obtain the initial YOLO model from the data center as needed, so that the model can subsequently be retrained for the edge device or for the scene in which the edge device is deployed. Further, the edge device may capture vehicle images of the current scene over a preset time period (for example, one month) and input the vehicle images into the initial YOLO model for processing, so as to identify the feature information of the vehicle in each image (for example, a vehicle identifier, a license plate number, a vehicle contour, etc.). The edge device obtains the target vehicle included in each vehicle image according to this feature information. At the same time, the activation value of each neuron in the initial YOLO model can be recorded. The edge device then sends the vehicle images, the target vehicles included in the vehicle images, and the activation value of each neuron in the initial YOLO model to the data center.
Accordingly, the data center deletes (prunes) the neurons whose activation value is 0 in the initial YOLO model, according to the activation value of each neuron, to obtain the corresponding pruned neural network model. Further, the data center takes each vehicle image and the target vehicle included in it as a training sample and retrains the pruned neural network model with multiple groups of training samples. Specifically, the data center may adjust the model parameters of the pruned neural network model by using the loss function of the following formula (1) to obtain a trained YOLO model.
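[Formula (1): the loss function loss, defined over P_i and F_i for the t pixel points of the vehicle image; the equation image is not reproduced in this text.]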
Here, loss is the loss function; i denotes pixel point i included in the vehicle image; and t is the total number of pixel points forming the vehicle image, in other words, the vehicle image includes t pixel points. P_i is the data output after pixel point i is processed by the pruned neural network model with the fully connected layer (specifically, the classification rule softmax deployed in the fully connected layer) removed. F_i is the data output after pixel point i is processed by the initial YOLO model with the fully connected layer (specifically, the classification rule softmax deployed in the fully connected layer) removed.
The data center trains and adjusts the model parameters of the pruned neural network model according to the loss function shown in formula (1). Optionally, to ensure model accuracy, the set of model parameters with the smallest loss function value found during training is used as the model parameters of the trained YOLO model. The invention does not limit or detail this here. Optionally, the data center may store the identifier of the edge device in association with the trained YOLO model, so that the edge device can subsequently obtain the trained YOLO model from the data center as needed.
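A loss of this kind can be sketched as follows. The exact expression of formula (1) is not recoverable from this text, so the mean squared error used here is only an assumed form that is consistent with the symbols defined above (a per-pixel comparison of P_i and F_i over t pixel points); the function name `feature_loss` is likewise hypothetical.

```python
from typing import Sequence


def feature_loss(pruned_outputs: Sequence[float], initial_outputs: Sequence[float]) -> float:
    """Assumed form of the loss: mean squared error between P_i and F_i.

    pruned_outputs[i] is P_i, the pre-fully-connected output of the pruned model
    for pixel point i; initial_outputs[i] is F_i, the corresponding output of the
    initial model.
    """
    t = len(initial_outputs)
    if len(pruned_outputs) != t:
        raise ValueError("P and F must cover the same t pixel points")
    return sum((p - f) ** 2 for p, f in zip(pruned_outputs, initial_outputs)) / t


example_loss = feature_loss([1.0, 2.0], [1.0, 4.0])
```

A loss of zero means the pruned model reproduces the pre-fully-connected outputs of the initial model exactly, which is the behavior the retraining aims to approach.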
Accordingly, the edge device can obtain the trained YOLO model from the data center and subsequently use it for vehicle detection. For example, assume that the edge device is a monitoring device deployed on a road; the edge device may capture a to-be-processed image that includes a vehicle to be detected. The edge device takes the image to be processed as the input of the trained YOLO model, performs calculations on each pixel point of the image with the neurons of each network layer in the trained YOLO model, identifies the feature information of the vehicle to be detected (such as a vehicle identifier, a license plate number, a vehicle contour, and the like), and thereby identifies the vehicle to be detected, so that vehicle detection can be carried out conveniently and efficiently.
By implementing the embodiment of the invention, different trained neural network models can be trained for different edge devices or deployment scenes of the edge devices to form customized neural network models, and the edge devices can analyze and process data based on the customized neural network models, so that the precision and the processing efficiency can be improved. In addition, in the embodiment of the invention, the collected data does not need to be marked manually, and the neural network model of the data center analyzes and processes the collected data according to the sample data sent by each edge device to obtain the processed data, thereby further improving the processing efficiency of each edge device. Because each edge device can process data according to the customized neural network model, the processing result can be obtained in a short time, and the time consumption of data processing is reduced.
The foregoing describes in detail a related embodiment of the data processing method provided in the embodiment of the present invention with reference to fig. 1 to 3. The following describes a data processing apparatus, a device, and a system according to an embodiment of the present invention with reference to fig. 5 to 6.
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention. The data processing apparatus 500 shown in fig. 5, applied to the edge device side, may include an obtaining module 501 and a processing module 502; wherein,
the obtaining module 501 is configured to obtain an initial neural network model, where the initial neural network model includes at least one neuron, and the neuron is configured to process input data of the initial neural network model to obtain activation information of the neuron;
the processing module 502 is configured to input data to be processed into the trained neural network model to obtain result data;
the result data is obtained by processing the data to be processed by using the neurons in the trained neural network model, the trained neural network model is obtained by training a pruned neural network model by using N groups of training samples, the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to respective activation information of the at least one neuron, the activation information is respective information of the at least one neuron when the at least one neuron in the initial neural network model is used for data processing, and N is a positive integer.
In one possible embodiment, the training samples include input data and output data, wherein the output data is obtained by calculating the input data using the initial neural network model.
In one possible embodiment, the activation information comprises at least one of: an activation value, a number of activations, and an average activation value; the pruning of the at least one neuron in the initial neural network model according to the respective activation information of the at least one neuron includes: when the activation information comprises an activation value, deleting the neurons corresponding to the activation values smaller than or equal to a first threshold value in the initial neural network model according to the respective activation values of the at least one neuron; when the activation information comprises activation times, deleting the neurons corresponding to the activation times smaller than or equal to a second threshold value in the initial neural network model according to the respective activation times of the at least one neuron; when the activation information comprises an average activation value, deleting the neurons corresponding to the average activation value smaller than or equal to a third threshold value in the initial neural network model according to the respective average activation value of the at least one neuron; the activation value is an output value of a neuron in the initial neural network model when the neuron performs data processing each time, the activation times are times when the activation value of the neuron in the initial neural network model is greater than or equal to a fourth threshold in the data processing performed by using the neuron in the initial neural network model for M times, the average activation value is an average value of the activation values of the neuron in the data processing performed by using the neuron in the initial neural network model for M times, and M is a positive integer less than or equal to N.
In a possible implementation manner, the trained neural network model is obtained by updating parameters in the pruning neural network model by using a loss function; the loss function is used for indicating error loss between training data and prediction data, the training data is data output before a full-connection layer obtained by inputting input data in the training sample into the initial neural network model, and the prediction data is data output before the full-connection layer obtained by inputting input data in the training sample into the pruning neural network model.
In a possible implementation, the obtaining module 501 is specifically configured to obtain an initial neural network model from a data center.
It should be understood that the apparatus 500 of the embodiment of the present invention may be implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), where the PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The data processing method shown in Fig. 3 can also be implemented by software, in which case the apparatus and its respective modules may be software modules.
The data processing apparatus 500 provided in the embodiment of the present invention may be correspondingly applied to execute the method provided in the embodiment of the present invention, and the functions of each module and/or other operations executed in the apparatus 500 are respectively for executing the flow steps of the method corresponding to fig. 3, and are not described herein again for brevity.
By implementing the embodiment of the invention, different trained neural network models can be designed for different edge devices or deployment scenes of the edge devices. The data processing is conveniently carried out by subsequently utilizing the neural network model suitable for the edge device or the deployment scene of the edge device, and the accuracy of the data processing is improved.
Fig. 6 is a schematic structural diagram of an edge device according to an embodiment of the present invention. The edge device 600 shown in fig. 6 may include one or more processors 601, communication interfaces 602, and memories 603, which may be connected by a bus or may communicate by other means such as wireless transmission. The embodiment of the present invention takes connection through a bus 604 as an example, wherein the memory 603 is used for storing instructions and the processor 601 is used for executing the instructions stored in the memory 603. The memory 603 stores program code, and the processor 601 may call the program code stored in the memory 603 to perform the following operations:
acquiring an initial neural network model, wherein the initial neural network model comprises at least one neuron, and the neuron is used for processing input data of the initial neural network model to obtain activation information of the neuron;
inputting data to be processed into the trained neural network model to obtain result data;
the result data is obtained by processing the data to be processed by using the neurons in the trained neural network model, the trained neural network model is obtained by training a pruned neural network model by using N groups of training samples, the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to respective activation information of the at least one neuron, the activation information is respective information of the at least one neuron when the at least one neuron in the initial neural network model is used for data processing, and N is a positive integer.
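The pruning step in the operations above can be sketched as follows, using the activation information as it is defined in the claims (the number of activations and the average activation value over M rounds of data processing). This is only an illustrative sketch under stated assumptions: the function names, the toy activation matrix, and all threshold values are hypothetical, and a single hidden layer of a dense network stands in for the initial neural network model.

```python
import numpy as np

def activation_stats(activations, fire_threshold):
    """Summarise activations of shape (M, num_neurons), one row per round of data processing.

    Returns the number of activations (rounds in which the activation value is
    greater than or equal to fire_threshold) and the average activation value.
    """
    counts = (activations >= fire_threshold).sum(axis=0)
    averages = activations.mean(axis=0)
    return counts, averages

def neurons_to_keep(values, prune_threshold):
    """Indices of neurons that survive: those at or below the threshold are deleted."""
    return np.flatnonzero(values > prune_threshold)

def prune_hidden_layer(w_in, w_out, keep):
    """Remove pruned neurons by slicing the weights that feed and read the layer."""
    return w_in[:, keep], w_out[keep, :]

# Toy activation values of 4 neurons over M = 3 rounds (hypothetical numbers).
activations = np.array([
    [0.9, 0.0, 0.2, 0.7],
    [0.8, 0.1, 0.0, 0.6],
    [1.0, 0.0, 0.1, 0.0],
])

counts, averages = activation_stats(activations, fire_threshold=0.5)
keep_by_count = neurons_to_keep(counts, prune_threshold=1)       # assumed threshold on the count
keep_by_average = neurons_to_keep(averages, prune_threshold=0.2)  # assumed threshold on the average

w_in = np.arange(8, dtype=float).reshape(2, 4)    # 2 inputs -> 4 hidden neurons
w_out = np.arange(12, dtype=float).reshape(4, 3)  # 4 hidden neurons -> 3 outputs
w_in_pruned, w_out_pruned = prune_hidden_layer(w_in, w_out, keep_by_average)

print(keep_by_count.tolist(), keep_by_average.tolist(), w_in_pruned.shape, w_out_pruned.shape)
```

Slicing both weight matrices with the same index array keeps the layer shapes consistent, so the pruned network can be trained and then used for inference without further structural changes.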
Optionally, in this embodiment of the present invention, the processor 601 may call the program code stored in the memory 603 to perform all or part of the steps described in the embodiment of the method illustrated in fig. 3, and/or other contents described in the text, and so on, which are not described herein again.
It should be appreciated that the processor 601 may consist of one or more general-purpose processors, such as a central processing unit (CPU). The processor 601 may be used to run the programs of the following functional modules in the related program code. The functional modules may specifically include, but are not limited to, the above-described acquisition module and/or processing module. That is, the processor 601 executes the functions of any one or more of the functional modules described above. For each functional module mentioned herein, reference may be made to the relevant explanations in the foregoing embodiments, and details are not described here.
The communication interface 602 may be a wired interface (e.g., an Ethernet interface) or a wireless interface (e.g., a cellular network interface or a wireless local area network interface) for communicating with other modules/devices. For example, in the embodiment of the present application, the communication interface 602 may be specifically configured to receive an initial neural network model or a trained neural network model sent by a data center.
The memory 603 may include a volatile memory, such as a random access memory (RAM); the memory may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 603 may also include a combination of the above kinds of memory. The memory 603 may be used to store a set of program code, so that the processor 601 can call the program code stored in the memory 603 to implement the functions of the above-mentioned functional modules involved in the embodiments of the present invention.
It should be understood that the edge device 600 according to the embodiment of the present invention may correspond to the data processing apparatus 500 shown in fig. 5, and may correspond to the execution body on the edge device side in the method shown in fig. 3. The above-mentioned and other operations and/or functions of each module in the edge device respectively implement the corresponding flows of the method in fig. 3 and, for brevity, are not described here again.
It should be noted that fig. 6 is only one possible implementation manner of the embodiment of the present invention, and in practical applications, the edge device may further include more or fewer components, which is not limited herein. For the content that is not shown or not described in the embodiment of the present invention, reference may be made to the related explanation in the embodiments described in fig. 1 to fig. 3, and details are not described here.
By implementing the embodiment of the invention, different trained neural network models can be designed for different edge devices or for different deployment scenarios of the edge devices. Data can then be processed using a neural network model suited to the edge device or its deployment scenario, which improves the accuracy of data processing.
Embodiments of the present invention further provide a data processing system, which includes the data center 102 and the edge device 104 shown in fig. 2, wherein an initial neural network model or a trained neural network model is deployed in the data center 102. The edge device comprises a processor, a memory, a communication interface and a bus; the processor, the communication interface and the memory communicate with each other through the bus; the communication interface is used for receiving and transmitting data; the memory is used for storing instructions; and the processor is configured to call the instructions in the memory and perform all or part of the implementation steps described in the embodiment of the method illustrated in fig. 3, which are not described herein again.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded or executed on a computer, the flows or functions according to embodiments of the invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that contains one or more collections of available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid-state drive (SSD).
The foregoing is only illustrative of the present invention. Those skilled in the art can conceive of changes or substitutions based on the specific embodiments provided by the present invention, and all such changes or substitutions are intended to be included within the scope of the present invention.

Claims (12)

1. A method of data processing, the method comprising:
an edge device obtains an initial neural network model, wherein the initial neural network model comprises at least one neuron, and the neuron is used for processing input data of the initial neural network model to obtain activation information of the neuron;
the edge device inputs data to be processed into the trained neural network model to obtain result data;
the result data is obtained by processing the data to be processed by using the neurons in the trained neural network model, the trained neural network model is obtained by training a pruned neural network model by using N groups of training samples, the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to respective activation information of the at least one neuron, the activation information is respective information of the at least one neuron when the at least one neuron in the initial neural network model is used for data processing, and N is a positive integer.
2. The method of claim 1, wherein the training samples comprise input data and output data, wherein the output data is computed from the input data using the initial neural network model.
3. The method according to claim 1 or 2, characterized in that the activation information comprises at least one of the following: an activation value, a number of activations, and an average activation value;
the pruning of the at least one neuron in the initial neural network model according to the respective activation information of the at least one neuron includes:
when the activation information comprises an activation value, deleting the neurons corresponding to the activation values smaller than or equal to a first threshold value in the initial neural network model according to the respective activation values of the at least one neuron;
when the activation information comprises a number of activations, deleting the neurons corresponding to the numbers of activations smaller than or equal to a second threshold value in the initial neural network model according to the respective number of activations of the at least one neuron;
when the activation information comprises an average activation value, deleting the neurons corresponding to the average activation value smaller than or equal to a third threshold value in the initial neural network model according to the respective average activation value of the at least one neuron;
the activation value is the output value of a neuron in the initial neural network model each time the neuron performs data processing, the number of activations is the number of times the activation value of the neuron is greater than or equal to a fourth threshold over M rounds of data processing performed by using the neuron in the initial neural network model, the average activation value is the average of the activation values of the neuron over the M rounds of data processing performed by using the neuron in the initial neural network model, and M is a positive integer less than or equal to N.
4. The method according to any one of claims 1 to 3, wherein the trained neural network model being obtained by training the pruned neural network model using N groups of training samples comprises:
the trained neural network model is obtained by updating parameters in the pruned neural network model by using a loss function based on the N groups of training samples;
the loss function is used for indicating the error loss between training data and prediction data, the training data is the data output before the fully connected layer when the input data in the training sample is input into the initial neural network model, and the prediction data is the data output before the fully connected layer when the input data in the training sample is input into the pruned neural network model.
5. The method according to any one of claims 1 to 4, wherein the edge device obtaining the initial neural network model comprises:
the edge device obtains an initial neural network model from the data center.
6. An edge device, characterized in that the edge device comprises an acquisition module and a processing module; wherein,
the obtaining module is configured to obtain an initial neural network model, where the initial neural network model includes at least one neuron, and the neuron is configured to process input data of the initial neural network model to obtain activation information of the neuron;
the processing module is used for inputting data to be processed into the trained neural network model to obtain result data;
the result data is obtained by processing the data to be processed by using the neurons in the trained neural network model, the trained neural network model is obtained by training a pruned neural network model by using N groups of training samples, the pruned neural network model is obtained by pruning at least one neuron in the initial neural network model according to respective activation information of the at least one neuron, the activation information is respective information of the at least one neuron when the at least one neuron in the initial neural network model is used for data processing, and N is a positive integer.
7. The edge device of claim 6, wherein the training samples comprise input data and output data, wherein the output data is computed from the input data using the initial neural network model.
8. The edge device of claim 6 or 7, wherein the activation information comprises at least one of: an activation value, a number of activations, and an average activation value;
the pruning of the at least one neuron in the initial neural network model according to the respective activation information of the at least one neuron includes:
when the activation information comprises an activation value, deleting the neurons corresponding to the activation values smaller than or equal to a first threshold value in the initial neural network model according to the respective activation values of the at least one neuron;
when the activation information comprises a number of activations, deleting the neurons corresponding to the numbers of activations smaller than or equal to a second threshold value in the initial neural network model according to the respective number of activations of the at least one neuron;
when the activation information comprises an average activation value, deleting the neurons corresponding to the average activation value smaller than or equal to a third threshold value in the initial neural network model according to the respective average activation value of the at least one neuron;
the activation value is the output value of a neuron in the initial neural network model each time the neuron performs data processing, the number of activations is the number of times the activation value of the neuron is greater than or equal to a fourth threshold over M rounds of data processing performed by using the neuron in the initial neural network model, the average activation value is the average of the activation values of the neuron over the M rounds of data processing performed by using the neuron in the initial neural network model, and M is a positive integer less than or equal to N.
9. The edge device of any of claims 6-8, wherein the trained neural network model is further obtained by updating parameters in the pruned neural network model using a loss function;
the loss function is used for indicating the error loss between training data and prediction data, the training data is the data output before the fully connected layer when the input data in the training sample is input into the initial neural network model, and the prediction data is the data output before the fully connected layer when the input data in the training sample is input into the pruned neural network model.
10. The edge device according to any of claims 6 to 9,
the obtaining module is specifically configured to obtain an initial neural network model from a data center.
11. An edge device comprising a memory and a processor coupled to the memory; the memory is configured to store instructions, and the processor is configured to execute the instructions; wherein the processor, when executing the instructions, performs the operational steps of the method of any one of the preceding claims 1-5.
12. A data processing system, characterized by comprising a data center and an edge device, wherein the data center is used for storing an initial neural network model and training the initial neural network model to obtain a trained neural network model; and the edge device is the edge device of any one of claims 6-9, or the edge device of claim 11.
CN201811016411.4A 2018-08-31 2018-08-31 Data processing method, device, equipment and system Pending CN110874550A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811016411.4A CN110874550A (en) 2018-08-31 2018-08-31 Data processing method, device, equipment and system
PCT/CN2019/085468 WO2020042658A1 (en) 2018-08-31 2019-05-05 Data processing method, device, apparatus, and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811016411.4A CN110874550A (en) 2018-08-31 2018-08-31 Data processing method, device, equipment and system

Publications (1)

Publication Number Publication Date
CN110874550A true CN110874550A (en) 2020-03-10

Family

ID=69642635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811016411.4A Pending CN110874550A (en) 2018-08-31 2018-08-31 Data processing method, device, equipment and system

Country Status (2)

Country Link
CN (1) CN110874550A (en)
WO (1) WO2020042658A1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111523640A (en) * 2020-04-09 2020-08-11 北京百度网讯科技有限公司 Training method and device of neural network model
CN112085281A (en) * 2020-09-11 2020-12-15 支付宝(杭州)信息技术有限公司 Method and device for detecting safety of business prediction model
CN113592059A (en) * 2020-04-30 2021-11-02 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for processing data
WO2021218095A1 (en) * 2020-04-30 2021-11-04 深圳市商汤科技有限公司 Image processing method and apparatus, and electronic device and storage medium
CN114422380A (en) * 2020-10-09 2022-04-29 维沃移动通信有限公司 Neural network information transmission method, device, communication equipment and storage medium
WO2022126902A1 (en) * 2020-12-18 2022-06-23 平安科技(深圳)有限公司 Model compression method and apparatus, electronic device, and medium
CN114692816A (en) * 2020-12-31 2022-07-01 华为技术有限公司 Processing method and equipment of neural network model
CN114925821A (en) * 2022-01-05 2022-08-19 华为技术有限公司 Compression method of neural network model and related system
WO2023279975A1 (en) * 2021-07-06 2023-01-12 华为技术有限公司 Model processing method, federated learning method, and related device

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7113958B2 (en) * 2019-03-11 2022-08-05 三菱電機株式会社 Driving support device and driving support method
CN111522657B (en) * 2020-04-14 2022-07-22 北京航空航天大学 Distributed equipment collaborative deep learning reasoning method
CN111783997B (en) * 2020-06-29 2024-04-23 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment
CN111967591B (en) * 2020-06-29 2024-07-02 上饶市纯白数字科技有限公司 Automatic pruning method and device for neural network and electronic equipment
CN113935390A (en) * 2020-06-29 2022-01-14 中兴通讯股份有限公司 Data processing method, system, device and storage medium
CN112001483A (en) * 2020-08-14 2020-11-27 广州市百果园信息技术有限公司 Method and device for pruning neural network model
CN112784967B (en) * 2021-01-29 2023-07-25 北京百度网讯科技有限公司 Information processing method and device and electronic equipment
CN112786028B (en) * 2021-02-07 2024-03-26 百果园技术(新加坡)有限公司 Acoustic model processing method, apparatus, device and readable storage medium
CN113011581B (en) * 2021-02-23 2023-04-07 北京三快在线科技有限公司 Neural network model compression method and device, electronic equipment and readable storage medium
CN116822635B (en) * 2023-05-12 2024-09-24 中国科学院深圳先进技术研究院 Track generation method, device, equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105512723A (en) * 2016-01-20 2016-04-20 南京艾溪信息科技有限公司 Artificial neural network calculating device and method for sparse connection
CN105640577A (en) * 2015-12-16 2016-06-08 深圳市智影医疗科技有限公司 Method and system automatically detecting local lesion in radiographic image
US20170061281A1 (en) * 2015-08-27 2017-03-02 International Business Machines Corporation Deep neural network training with native devices
US20170286830A1 (en) * 2016-04-04 2017-10-05 Technion Research & Development Foundation Limited Quantized neural network training and inference
CN107239825A (en) * 2016-08-22 2017-10-10 北京深鉴智能科技有限公司 Consider the deep neural network compression method of load balancing
CN107609598A (en) * 2017-09-27 2018-01-19 武汉斗鱼网络科技有限公司 Image authentication model training method, device and readable storage medium storing program for executing
US20180114114A1 (en) * 2016-10-21 2018-04-26 Nvidia Corporation Systems and methods for pruning neural networks for resource efficient inference
CN108229533A (en) * 2017-11-22 2018-06-29 深圳市商汤科技有限公司 Image processing method, model pruning method, device and equipment
CN108416440A (en) * 2018-03-20 2018-08-17 上海未来伙伴机器人有限公司 A kind of training method of neural network, object identification method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247989B (en) * 2017-06-15 2020-11-24 北京图森智途科技有限公司 Real-time computer vision processing method and device
CN108229679A (en) * 2017-11-23 2018-06-29 北京市商汤科技开发有限公司 Convolutional neural networks de-redundancy method and device, electronic equipment and storage medium

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170061281A1 (en) * 2015-08-27 2017-03-02 International Business Machines Corporation Deep neural network training with native devices
CN105640577A (en) * 2015-12-16 2016-06-08 深圳市智影医疗科技有限公司 Method and system automatically detecting local lesion in radiographic image
CN105512723A (en) * 2016-01-20 2016-04-20 南京艾溪信息科技有限公司 Artificial neural network calculating device and method for sparse connection
US20170286830A1 (en) * 2016-04-04 2017-10-05 Technion Research & Development Foundation Limited Quantized neural network training and inference
CN107239825A (en) * 2016-08-22 2017-10-10 北京深鉴智能科技有限公司 Consider the deep neural network compression method of load balancing
US20180114114A1 (en) * 2016-10-21 2018-04-26 Nvidia Corporation Systems and methods for pruning neural networks for resource efficient inference
CN107609598A (en) * 2017-09-27 2018-01-19 武汉斗鱼网络科技有限公司 Image authentication model training method, device and readable storage medium storing program for executing
CN108229533A (en) * 2017-11-22 2018-06-29 深圳市商汤科技有限公司 Image processing method, model pruning method, device and equipment
CN108416440A (en) * 2018-03-20 2018-08-17 上海未来伙伴机器人有限公司 A kind of training method of neural network, object identification method and device

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111523640A (en) * 2020-04-09 2020-08-11 北京百度网讯科技有限公司 Training method and device of neural network model
CN111523640B (en) * 2020-04-09 2023-10-31 北京百度网讯科技有限公司 Training method and device for neural network model
US11888705B2 (en) 2020-04-30 2024-01-30 EMC IP Holding Company LLC Method, device, and computer program product for processing data
CN113592059A (en) * 2020-04-30 2021-11-02 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for processing data
WO2021218095A1 (en) * 2020-04-30 2021-11-04 深圳市商汤科技有限公司 Image processing method and apparatus, and electronic device and storage medium
CN112085281B (en) * 2020-09-11 2023-03-10 支付宝(杭州)信息技术有限公司 Method and device for detecting safety of business prediction model
CN112085281A (en) * 2020-09-11 2020-12-15 支付宝(杭州)信息技术有限公司 Method and device for detecting safety of business prediction model
CN114422380B (en) * 2020-10-09 2023-06-09 维沃移动通信有限公司 Neural network information transmission method, device, communication equipment and storage medium
CN114422380A (en) * 2020-10-09 2022-04-29 维沃移动通信有限公司 Neural network information transmission method, device, communication equipment and storage medium
WO2022126902A1 (en) * 2020-12-18 2022-06-23 平安科技(深圳)有限公司 Model compression method and apparatus, electronic device, and medium
CN114692816A (en) * 2020-12-31 2022-07-01 华为技术有限公司 Processing method and equipment of neural network model
CN114692816B (en) * 2020-12-31 2023-08-25 华为技术有限公司 Processing method and equipment of neural network model
WO2023279975A1 (en) * 2021-07-06 2023-01-12 华为技术有限公司 Model processing method, federated learning method, and related device
CN114925821A (en) * 2022-01-05 2022-08-19 华为技术有限公司 Compression method of neural network model and related system
CN114925821B (en) * 2022-01-05 2023-06-27 华为技术有限公司 Compression method and related system of neural network model

Also Published As

Publication number Publication date
WO2020042658A1 (en) 2020-03-05

Similar Documents

Publication Publication Date Title
CN110874550A (en) Data processing method, device, equipment and system
CN111008640A (en) Image recognition model training and image recognition method, device, terminal and medium
KR20200052806A (en) Operating method of deep learning based climate change prediction system
CN111967594A (en) Neural network compression method, device, equipment and storage medium
EP3620982B1 (en) Sample processing method and device
CN111523640A (en) Training method and device of neural network model
CN112905997B (en) Method, device and system for detecting poisoning attack facing deep learning model
CN110096979B (en) Model construction method, crowd density estimation method, device, equipment and medium
CN113095370A (en) Image recognition method and device, electronic equipment and storage medium
CN109446897B (en) Scene recognition method and device based on image context information
CN110751191A (en) Image classification method and system
CN115953643A (en) Knowledge distillation-based model training method and device and electronic equipment
CN107392311A (en) The method and apparatus of sequence cutting
US20210166065A1 (en) Method and machine readable storage medium of classifying a near sun sky image
CN112597919A (en) Real-time medicine box detection method based on YOLOv3 pruning network and embedded development board
CN117671597B (en) Method for constructing mouse detection model and mouse detection method and device
CN113887330A (en) Target detection system based on remote sensing image
CN111339952B (en) Image classification method and device based on artificial intelligence and electronic equipment
CN117095460A (en) Self-supervision group behavior recognition method and system based on long-short time relation predictive coding
CN117152528A (en) Insulator state recognition method, insulator state recognition device, insulator state recognition apparatus, insulator state recognition program, and insulator state recognition program
CN112288702A (en) Road image detection method based on Internet of vehicles
CN110490876B (en) Image segmentation method based on lightweight neural network
CN116826734A (en) Photovoltaic power generation power prediction method and device based on multi-input model
CN114219051B (en) Image classification method, classification model training method and device and electronic equipment
CN116563809A (en) Road disease identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200310
