CN111049836A - Data processing method, electronic device and computer readable storage medium - Google Patents

Data processing method, electronic device and computer readable storage medium Download PDF

Info

Publication number
CN111049836A
CN111049836A (application CN201911293638.8A)
Authority
CN
China
Prior art keywords
sub
result
data
network
compressing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911293638.8A
Other languages
Chinese (zh)
Inventor
马原 (Ma Yuan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Pengsi Technology Co Ltd
Original Assignee
Beijing Pengsi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Pengsi Technology Co Ltd filed Critical Beijing Pengsi Technology Co Ltd
Priority to CN201911293638.8A priority Critical patent/CN111049836A/en
Publication of CN111049836A publication Critical patent/CN111049836A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/565 Conversion or adaptation of application format or content
    • H04L67/5651 Reducing the amount or size of exchanged application data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols

Abstract

The embodiment of the invention discloses a data processing method, an electronic device and a computer-readable storage medium. The method comprises the following steps: acquiring data to be processed, inputting the data to be processed into a first sub-network deployed in the client, and obtaining a sub-result; compressing the sub-result; sending the compressed sub-result to a server side, so that the server side decompresses the sub-result and inputs the decompressed sub-result into a second sub-network deployed in the server side to obtain a processing result; and receiving the processing result returned by the server. In the data processing method provided by the embodiment of the invention, the data passes in turn through the sub-networks deployed at the client and the server to obtain the processing result, which prevents data leakage and improves data security while the neural network is running. Moreover, because the sub-result output by the first sub-network is compressed before being sent to the server, the amount of data transmitted is reduced, transmission bandwidth is saved, and latency is lowered.

Description

Data processing method, electronic device and computer readable storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method, an electronic device, and a computer-readable storage medium.
Background
Neural networks, particularly deep neural networks, are widely used in many fields such as image processing, natural language processing, speech recognition, and the like, and are used to perform various tasks such as image classification, semantic understanding, character recognition, and the like.
Disclosure of Invention
Embodiments of the present invention provide a data processing method, an electronic device, and a computer-readable storage medium, which can improve data security during neural network operation, reduce the amount of data transmitted between a client and a server, and save transmission bandwidth, thereby reducing latency.
In a first aspect, an embodiment of the present invention provides a data processing method, where the method is executed by a client, and includes:
acquiring data to be processed, inputting the data to be processed into a first sub-network deployed in the client, and acquiring a sub-result;
compressing the sub-result;
sending the compressed sub-result to a server side, so that the server side decompresses the sub-result, and inputs the decompressed sub-result into a second sub-network deployed in the server side to obtain a processing result;
receiving a processing result returned by the server; wherein the first sub-network and the second sub-network comprise different functional layers in the same neural network, and the first sub-network comprises at least one convolutional layer and a pooling layer.
In a second aspect, an embodiment of the present invention further provides a data processing method, where the method is executed by a server and includes:
receiving a sub-result sent by a client; the sub-result is obtained by the client inputting the data to be processed into a first sub-network deployed in the client and performing compression processing;
decompressing the sub-result, and inputting the decompressed sub-result into a second sub-network deployed in the server to obtain a processing result;
sending the processing result to a client; wherein the first sub-network and the second sub-network comprise different functional layers in the same neural network, and the first sub-network comprises at least one convolutional layer and a pooling layer.
In a third aspect, an embodiment of the present invention further provides an electronic device, including:
a processor;
a memory for storing a computer program;
when executed by the processor, the computer program causes the processor to implement the data processing method according to the embodiment of the present invention.
In a fourth aspect, the embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the data processing method according to the embodiment of the present invention.
The embodiment of the invention comprises the steps of first obtaining data to be processed, inputting the data to be processed into a first sub-network deployed in a client, and obtaining a sub-result; then compressing the sub-result; then sending the compressed sub-result to the server side, so that the server side decompresses the sub-result and inputs the decompressed sub-result into a second sub-network deployed in the server side to obtain a processing result; and finally receiving the processing result returned by the server; wherein the first sub-network and the second sub-network comprise different functional layers in the same neural network, and the first sub-network comprises at least one convolutional layer and a pooling layer. In the data processing method provided by the embodiment of the invention, the neural network is divided into two sub-networks with different functions, deployed respectively at the client and the server; during data processing the data passes in turn through both sub-networks to obtain the processing result, which prevents data leakage and improves data security while the neural network is running. Moreover, because the sub-result output by the first sub-network is compressed before being sent to the server, the amount of data transmitted is reduced, transmission bandwidth is saved, and latency is lowered.
Drawings
FIG. 1 is a flow chart of a data processing method in an embodiment of the invention;
FIG. 2 is an exemplary diagram of a neural network split in an embodiment of the present invention;
FIG. 3 is a flow chart of a method of data processing in an embodiment of the invention;
FIG. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 6 is a block diagram of a data processing system in an embodiment of the invention;
fig. 7 is a schematic structural diagram of a computer device in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings of the embodiments of the present disclosure. It is to be understood that the described embodiments are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the disclosure without any inventive step, are within the scope of protection of the disclosure.
Unless otherwise defined, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in this disclosure is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. Also, the use of the terms "a," "an," or "the" and similar referents do not denote a limitation of quantity, but rather denote the presence of at least one. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.
In the technology known to the inventor, the neural network is generally deployed at a server; the client sends the data to be processed, such as images or text, to the server and receives the processing result (such as a classification result) returned by the server. This approach not only carries a risk of data leakage; more importantly, it places high demands on transmission bandwidth, and the time spent transmitting the data reduces the processing efficiency of the whole neural network.
To keep the following description of the embodiments of the present disclosure clear and concise, detailed descriptions of known functions and known components have been omitted from the present disclosure.
In the embodiments described below, the neural network may be selected according to the function to be performed. It may be a convolutional neural network (CNN) or one of its specific variants such as the fully convolutional network (FCN) or the segmentation network SegNet; it may be a recurrent neural network (RNN) or one of its variants such as the long short-term memory network (LSTM) or the gated recurrent unit (GRU); various other neural network structures are also possible, such as the optical-flow network FlowNet.
The constituent structure of a neural network can be understood by those skilled in the art. For example, a convolutional layer performs convolution operations, extracting feature information from an input image (e.g., of size 227 × 227) to obtain a feature map (e.g., of size 13 × 13); a pooling layer performs a pooling operation on its input, such as max-pooling or mean-pooling; an activation layer introduces nonlinearity through an activation function, for example a rectified linear unit variant (ReLU, Leaky-ReLU, P-ReLU, R-ReLU), a sigmoid function, or a hyperbolic tangent (tanh) function. A random deactivation layer (Dropout) is used to alleviate over-fitting; its drop probability may be set to 0.4, 0.5, or the like. A fully-connected layer (also called a dense layer) converts the feature map output by the convolutional layers into a one-dimensional vector.
To give the neural network a desired function, for example classification, an LR (logistic regression) classifier, a Softmax classifier, or the like may be connected to the output of the neural network.
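For concreteness, the sketch below assembles these functional layers into a toy network in PyTorch. It is only an illustration: the layer sizes (a 227 × 227 RGB input, 11 × 11 and 3 × 3 kernels, 10 output classes) are our assumptions, not values fixed by this disclosure.

```python
import torch
import torch.nn as nn

toy_net = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=11, stride=4),  # convolutional layer: extracts feature maps
    nn.ReLU(),                                   # activation layer: introduces nonlinearity
    nn.MaxPool2d(kernel_size=3, stride=2),       # pooling layer: spatial down-sampling
    nn.Flatten(),                                # feature map -> one-dimensional vector
    nn.Dropout(p=0.5),                           # random deactivation layer against over-fitting
    nn.Linear(32 * 27 * 27, 10),                 # fully-connected (dense) layer
    nn.Softmax(dim=1),                           # classifier head producing class probabilities
)

x = torch.randn(1, 3, 227, 227)                  # one 227 x 227 RGB input
print(toy_net(x).shape)                          # torch.Size([1, 10])
```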
Referring to fig. 1, at least one embodiment of the present invention discloses a data processing method that inputs data into a neural network for processing. The method may be performed by a data processing apparatus, which may be composed of hardware and/or software and is generally integrated in a device having a data processing function, here a client device. As shown in fig. 1, the method includes the following steps:
step 110, obtaining data to be processed, inputting the data to be processed into a first sub-network deployed in the client, and obtaining a sub-result.
Wherein the first sub-network comprises at least one convolutional layer and a pooling layer. The data to be processed may be any data to be analyzed by the neural network, the final processing result being the analysis result; for example, it may be image data for face recognition or voice data for voice recognition. The data to be processed may be acquired by a camera or microphone of the client and preprocessed, may be stored locally, or may be sent by another terminal.
In this embodiment, the neural network is first split into two sub-networks including different functional layers, namely a first sub-network and a second sub-network. A first sub-network is deployed at a client, a second sub-network is deployed at a server, and the output of the first sub-network is the input of the second sub-network.
In this embodiment, the first sub-network comprises at least one convolutional layer and a pooling layer.
For example, the stride of the convolutional layer and of the pooling layer is greater than 1, so the size of the feature data shrinks rapidly as the data to be processed passes through these layers, reducing the dimensionality of the data.
For example, assume the first sub-network receives image data of width 227, height 227 and 3 channels, represented as integer (char) data (1 byte per value); the amount of data is 227 × 227 × 3 × 1 = 154,587 bytes. Set the stride of the convolutional layer in the first sub-network to 4, the stride of the pooling layer to 2, and the number of output channels of the pooling layer to 32, with the output represented as floating point values; the output is then of size 27 × 27 with 32 channels, and its data amount is 27 × 27 × 32 × 4 = 93,312 bytes. The ratio of the output data amount to the original data amount is 93312/154587 = 60.36%.
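A minimal sketch reproducing this arithmetic: the strides (4 and 2), the 32 output channels and the 227 × 227 × 3 input come from the example above, while the kernel sizes (an 11 × 11 convolution and 3 × 3 pooling) are assumptions chosen so the feature map comes out at 27 × 27.

```python
import torch
import torch.nn as nn

first_subnet = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=11, stride=4),  # convolution with stride 4
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),       # pooling with stride 2, 32 channels
)

x = torch.randn(1, 3, 227, 227)                  # 227 x 227 x 3 input image
sub_result = first_subnet(x)                     # float32 feature map

original_bytes = 227 * 227 * 3 * 1               # 154587 bytes as 1-byte integers
output_bytes = sub_result.numel() * 4            # 27 * 27 * 32 * 4 = 93312 bytes as float32
print(sub_result.shape)                          # torch.Size([1, 32, 27, 27])
print(f"{output_bytes / original_bytes:.2%}")    # 60.36%
```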
Optionally, the first sub-network further includes an activation layer, disposed after the convolutional layer.
For example, the activation layer may be disposed between the convolutional layer and the pooling layer, or may be disposed after the pooling layer.
The activation layer sets input values smaller than 0 to 0 and keeps values larger than 0 unchanged, so the output data is sparse. This facilitates subsequent compression, since the large number of zero values can be compressed away to reduce the data volume.
Because the features after the activation layer are sparse, the pooling result output by the pooling layer contains a large number of zeros and can be sparsely compressed, for example using sparse-matrix storage methods such as triple (row, column, value) arrays, row-pointer linked lists, or orthogonal (cross) linked lists.
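As an illustration of the sparse-storage idea, the sketch below compresses a ReLU-style feature map into (index, value) pairs, a flattened variant of the triple-array method named above; the function names are ours, not from this disclosure.

```python
import numpy as np

def sparse_compress(feature_map: np.ndarray):
    """Store only the nonzero entries as (index, value) pairs."""
    flat = feature_map.ravel()
    idx = np.flatnonzero(flat)           # positions of nonzero data
    return feature_map.shape, idx, flat[idx]

def sparse_decompress(shape, idx, values):
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[idx] = values                   # scatter values back into a dense array
    return flat.reshape(shape)

# A ReLU-style feature map: roughly half of the entries are zero.
fm = np.maximum(np.random.randn(32, 27, 27), 0).astype(np.float32)
shape, idx, vals = sparse_compress(fm)
assert np.array_equal(sparse_decompress(shape, idx, vals), fm)
print(f"stored fraction: {len(vals) / fm.size:.2%}")
```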
The sub-result may be feature data (feature map) obtained by feature extraction in the first sub-network. Specifically, after the client acquires the data to be processed, the data to be processed is input into a first sub-network deployed in the client for analysis, and a sub-result is obtained. The sub-results are significantly reduced in data size relative to the original data.
And step 120, compressing the sub-result.
Optionally, the sub-result may be compressed using a data quantization method and/or a source coding method.
For example, the data quantization method may be: finding the maximum value in the sub-result; determining the number of exponent bits according to the maximum value; and converting the floating point numbers in the sub-result into integers according to the number of exponent bits.
Wherein the base of the exponent is 2. Illustratively, assuming a maximum value of 15, the number of exponent bits is 4, since 2^4 = 16 ≥ 15. The formula for converting a floating point number in the sub-result into an integer according to the number of exponent bits is: I = Int(Float × 2^n), where I denotes the quantized data, Int(·) is the rounding operation, Float denotes the original floating point number, and n is the number of exponent bits.
Illustratively, the float data in the sub-result is quantized to 8 bits, converting the data type in the sub-result from floating point to 8-bit integer (float32 to int8), so the space occupied by a single data point drops from 4 bytes to 1 byte. The original data amount is 227 × 227 × 3 × 1 = 154,587 bytes, the sub-result before compression is 27 × 27 × 32 × 4 = 93,312 bytes, and the sub-result after compression is 27 × 27 × 32 × 1 = 23,328 bytes; the ratio of compressed data to original data is 23328/154587 = 15.09%.
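A hedged sketch of this quantization scheme follows. The maximum-value search and the exponent bit count follow the description above; allocating the remaining 7 - n bits of a signed int8 to the fractional part (so the scaled values stay in range) is our interpretation, not something the text fixes.

```python
import numpy as np

def quantize_int8(sub_result: np.ndarray):
    max_val = float(np.abs(sub_result).max())     # step 1: maximum in the sub-result
    n = int(np.ceil(np.log2(max_val + 1)))        # step 2: exponent bits, 2^n >= max
    scale = 2.0 ** (7 - n)                        # 7 - n fractional bits of a signed int8
    q = np.round(sub_result * scale).astype(np.int8)  # step 3: float -> integer
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) / scale           # server-side restoration

x = np.random.uniform(-15, 15, (27, 27, 32)).astype(np.float32)
q, scale = quantize_int8(x)
print(q.nbytes, x.nbytes)                         # 23328 vs 93312 bytes
print(float(np.abs(dequantize(q, scale) - x).max()))  # error at most 1/(2*scale)
```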
For example, the source coding method is at least one lossless source coding method such as arithmetic coding, Fano coding, Huffman coding, or Shannon coding.
In an embodiment of the present invention, Huffman Coding is used to compress the sub-result: prefix-free codewords with the shortest average length are constructed according to the occurrence probabilities of the symbols. The specific algorithm is not repeated here.
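For reference, the sketch below is the compact textbook Huffman construction (not code from this disclosure); the heap-based builder and the skewed sample data are illustrative assumptions.

```python
import heapq
from collections import Counter

def huffman_code(data: bytes) -> dict:
    """Build a prefix-free code table from symbol frequencies."""
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)             # two least-frequent subtrees
        f2, i, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, i, merged))  # merged subtree
    return heap[0][2]

# Skewed symbol distribution, like a quantized sub-result full of zeros.
data = bytes([0] * 90 + [1] * 6 + [2] * 3 + [3])
code = huffman_code(data)
compressed_bits = sum(len(code[b]) for b in data)
print(code)
print(f"{compressed_bits / (8 * len(data)):.2%} of the raw size")
```

Frequent symbols (here the zeros typical of an activated, quantized sub-result) receive the shortest codewords, which is where the compression gain comes from.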
In this embodiment, the sub-result may be compressed only by a quantization method, only by a source coding method, or by both methods.
Step 130, sending the compressed sub-result to the server, so that the server decompresses the sub-result, and inputs the decompressed sub-result into a second sub-network deployed in the server, to obtain a processing result.
The server decompresses the sub-result according to the compression method used, as described in the foregoing embodiments.
For example, if the foregoing compression used a data quantization method, such as float32 to int8, decompression may restore int8 to float32. Mainstream neural network frameworks already provide functions for this, such as the tf.cast() function in TensorFlow and the tensor1.type_as(tensor2) method in PyTorch.
For example, if the foregoing compression used source coding, the compressed data is decoded; if Huffman coding was used for compression, Huffman decoding is used for decompression.
For example, if the compression used a sparsification method, the compressed data is recovered by the method corresponding to that sparse storage.
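A minimal sketch of the server-side restoration using PyTorch's built-in cast, of the kind just mentioned; how the quantization scale reaches the server is an assumption on our part, since no wire format is specified here.

```python
import torch

def restore(q_int8: torch.Tensor, scale: float) -> torch.Tensor:
    # int8 -> float32 via a built-in cast, then undo the quantization scale
    return q_int8.to(torch.float32) / scale

q = torch.randint(-120, 121, (1, 32, 27, 27), dtype=torch.int8)  # received payload
features = restore(q, scale=8.0)       # scale = 2^(7-n); assumed to travel with the payload
print(features.dtype, features.shape)  # torch.float32, ready for the second sub-network
```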
Step 140, receiving the processing result returned by the server.
Wherein the first sub-network and the second sub-network comprise different functional layers in the same neural network. In this embodiment, the functional layers of the neural network include: one or more of a convolutional layer, an active layer, a pooling layer, a random deactivation layer, and a fully-connected layer; the number of each functional layer is one or more. When the neural network is split, any network layer can be used as a split node, and the split node is irrelevant to the function of the layer.
Alternatively, the neural network may be split into two sub-networks as follows: the first sub-network comprises the input layer and one or more functional layers, and the output of its last functional layer is compressed; the second sub-network comprises one or more functional layers and the output layer, the sub-result input to its first functional layer is decompressed, and the output layer is connected to a classifier or the like to output the processing result.
Exemplarily, fig. 2 is an example of splitting the neural network in this embodiment. As shown in fig. 2, the neural network is split into two sub-networks. The first sub-network comprises an input layer, a convolutional layer, an activation layer and a pooling layer; the second sub-network comprises one or more groups of functional layers (convolutional, activation and pooling layers), one or more further functional layers (fully-connected, convolutional and random deactivation (Dropout) layers) and a final fully-connected layer, to which the classifier is connected so that the classification result is the output processing result.
In some embodiments, after the neural network is split, the number of functional layers included in the first sub-network is smaller than the number of functional layers included in the second sub-network. The advantage of this is that most of the neural network's computation is borne by the server, reducing the computational load on the client.
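One way to realize such a split is sketched below in PyTorch. The layer stack and the split index are illustrative assumptions; the only constraints taken from the description are that any layer may serve as the split node and that the client-side part stays shorter than the server-side part.

```python
import torch
import torch.nn as nn

layers = [
    nn.Conv2d(3, 32, kernel_size=11, stride=4), nn.ReLU(), nn.MaxPool2d(3, stride=2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(3, stride=2),
    nn.Flatten(), nn.Dropout(0.5), nn.Linear(64 * 13 * 13, 10),
]

split = 3                                        # any layer may serve as the split node
first_subnet = nn.Sequential(*layers[:split])    # deployed at the client (fewer layers)
second_subnet = nn.Sequential(*layers[split:])   # deployed at the server

x = torch.randn(1, 3, 227, 227)
sub_result = first_subnet(x)                     # client side, compressed before sending
result = second_subnet(sub_result)               # server side, after decompression
print(sub_result.shape, result.shape)            # [1, 32, 27, 27] and [1, 10]
```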
According to the technical scheme of this embodiment, the neural network is divided into two sub-networks with different functions, deployed respectively at the client and the server; during data processing the data passes in turn through both sub-networks to obtain the processing result, which prevents data leakage and improves data security while the neural network is running. Moreover, because the sub-result output by the first sub-network is compressed before being sent to the server, the amount of data transmitted is reduced, transmission bandwidth is saved, and latency is lowered.
Referring to fig. 3, this embodiment is a flowchart of a data processing method applicable to inputting data into a neural network for processing. The method may be performed by a data processing apparatus, which may be composed of hardware and/or software and is generally integrated in a device having a data processing function; the device may be an electronic device such as a server or a server cluster. As shown in fig. 3, the method includes the following steps:
step 210, receiving a sub-result sent by a client;
and the sub-result is obtained by inputting the data to be processed into a first sub-network deployed in the client by the client and performing compression processing. The first sub-network comprises at least one convolutional layer and a pooling layer. The data to be processed may be data to be analyzed calculated based on a neural network, and the final processing result is an analysis result, for example, the data may be image data for face recognition, voice data for voice recognition, or the like. The mode of acquiring the data to be processed may be data acquired by a camera or a microphone of the client and subjected to preprocessing, or data stored locally, or data sent by other terminals.
In this embodiment, the neural network is first split into two sub-networks including different functional layers, which are respectively a first sub-network and a second sub-network. A first sub-network is deployed at a client, a second sub-network is deployed at a server, and the output of the first sub-network is the input of the second sub-network.
In this embodiment, the first sub-network comprises at least one convolutional layer and a pooling layer. For example, the stride of the convolutional layer and of the pooling layer is greater than 1, so the size of the feature data shrinks rapidly as the data to be processed passes through these layers, reducing the dimensionality of the data.
The client may compress the sub-result using a data quantization method and/or a source coding method. For the specific compression process, refer to the above embodiments; it is not repeated here.
Optionally, the first sub-network further includes an activation layer, disposed after the convolutional layer, for example between the convolutional layer and the pooling layer, or after the pooling layer.
The activation layer sets input values smaller than 0 to 0 and keeps values larger than 0 unchanged, so the output data is sparse. This facilitates subsequent compression, since the large number of zero values can be compressed away to reduce the data volume.
Because the features after the activation layer are sparse, the pooling result output by the pooling layer contains a large number of zeros and can be sparsely compressed, for example using sparse-matrix storage methods such as triple (row, column, value) arrays, row-pointer linked lists, or orthogonal (cross) linked lists.
The sub-result may be feature data (a feature map) obtained by feature extraction in the first sub-network. Specifically, after the client obtains the data to be processed, it inputs the data into the first sub-network deployed in the client for analysis, obtains a sub-result, compresses the sub-result, and finally sends the compressed sub-result to the server. The sub-result is significantly smaller than the original data.
Step 220, decompressing the sub-result, and inputting the decompressed sub-result into a second sub-network deployed in the server to obtain a processing result.
Wherein the second sub-network and the first sub-network comprise different functional layers. Optionally, the sub-result may be decompressed as follows: in response to the sub-result having been compressed by a data quantization method, restoring the quantized data; in response to the sub-result having been compressed by a source coding method, decoding the compressed data; and in response to the sub-result having been compressed by a sparsification method, recovering the compressed data.
For example, if the foregoing compression used a data quantization method, such as float32 to int8, decompression may restore int8 to float32. Mainstream neural network frameworks already provide functions for this, such as the tf.cast() function in TensorFlow and the tensor1.type_as(tensor2) method in PyTorch.
For example, if the foregoing compression used source coding, the compressed data is decoded; if Huffman coding was used for compression, Huffman decoding is used for decompression.
For example, if the compression used a sparsification method, the compressed data is recovered by the method corresponding to that sparse storage.
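Putting the three cases together, a hedged dispatch sketch follows; the method tag accompanying the payload and the huffman_decode helper are our assumptions, since nothing here specifies how the client signals which compression was applied.

```python
import numpy as np

def decompress(method: str, payload, aux):
    if method == "quantization":                 # case 1: restore quantized data
        return payload.astype(np.float32) / aux  # aux is the quantization scale
    if method == "source_coding":                # case 2: decode, e.g. Huffman
        table, shape = aux
        # huffman_decode is a hypothetical helper, the inverse of the earlier sketch
        return huffman_decode(payload, table).reshape(shape)
    if method == "sparse":                       # case 3: recover from sparse storage
        shape, idx = aux
        flat = np.zeros(int(np.prod(shape)), dtype=np.float32)
        flat[idx] = payload
        return flat.reshape(shape)
    raise ValueError(f"unknown compression method: {method}")
```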
Step 230, sending the processing result to the client.
Wherein the first sub-network and the second sub-network comprise different functional layers in the same neural network. The way of splitting the neural network into two sub-networks is described in the above embodiments and is not repeated here. In this embodiment, after the neural network is split, the number of network layers included in the first sub-network is smaller than the number of network layers included in the second sub-network. The advantage of this is that most of the neural network's computation is borne by the server, reducing the computational load on the client.
According to the technical scheme of this embodiment, the neural network is divided into two sub-networks with different functions, deployed respectively at the client and the server; during data processing the data passes in turn through both sub-networks to obtain the processing result, which prevents data leakage and improves data security while the neural network is running. Moreover, because the sub-result output by the first sub-network is compressed before being sent to the server, the amount of data transmitted is reduced, transmission bandwidth is saved, and latency is lowered.
Referring to fig. 4, a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention is provided. The device is arranged at the client, and as shown in fig. 4, the device includes: a sub-result obtaining module 410, a compressing module 420, a sub-result sending module 430 and a processing result receiving module 440.
A sub-result obtaining module 410, configured to obtain data to be processed, and input the data to be processed into a first sub-network deployed in the client, so as to obtain a sub-result; the first sub-network comprises at least one convolutional layer and a pooling layer;
a compressing module 420, configured to perform compression processing on the sub-result;
the sub-result sending module 430 is configured to send the compressed sub-result to the server, so that the server decompresses the sub-result, and inputs the decompressed sub-result into a second sub-network deployed in the server to obtain a processing result;
a processing result receiving module 440, configured to receive a processing result returned by the server; wherein the first sub-network and the second sub-network comprise different functional layers in the same neural network.
Optionally, the compressing module 420 is further configured to:
and compressing the sub-result using a data quantization method and/or a source coding method.
Optionally, the compressing module 420 is further configured to:
searching the maximum value in the sub-results;
determining an exponential digit according to the maximum value;
and converting the floating point number in the sub-result into an integer according to the exponent digit.
Optionally, compressing the sub-result by using a source coding method, including: and compressing the sub-result by adopting a Huffman coding method.
Optionally, the first sub-network further includes an activation layer disposed after the convolutional layer; the compression module 420 is further configured to compress the sub-result using a sparse compression method.
Optionally, the step size of both the convolutional layer and the pooling layer in the first sub-network is greater than 1.
Optionally, the number of functional layers included in the first sub-network is smaller than the number of functional layers included in the second sub-network.
Referring to fig. 5, a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention is provided. As shown in fig. 5, the apparatus includes: a sub-result receiving module 510, a decompression module 520 and a processing result transmitting module 530.
A sub-result receiving module 510, configured to receive a sub-result sent by a client; the sub-result is obtained by the client inputting the data to be processed into a first sub-network deployed in the client and performing compression processing; the first sub-network comprises at least one convolutional layer and a pooling layer;
a decompression module 520, configured to decompress the sub-result, and input the decompressed sub-result into a second sub-network deployed in the server to obtain a processing result;
a processing result sending module 530, configured to send the processing result to the client; wherein the first sub-network and the second sub-network comprise different functional layers in the same neural network.
Optionally, the decompressing module 520 is further configured to:
in response to the sub-result having been compressed by a data quantization method, restore the quantized data;
in response to the sub-result having been compressed by a source coding method, decode the compressed data;
and in response to the sub-result having been compressed by a sparsification method, recover the compressed data.
The device can execute the methods provided by all the embodiments of the invention, and has corresponding functional modules and beneficial effects for executing the methods. For details not described in detail in this embodiment, reference may be made to the methods provided in all the foregoing embodiments of the present invention.
Referring to fig. 6, a schematic structural diagram of a data processing system according to an embodiment of the present invention is provided. As shown in fig. 6, the system includes: client and server.
The client is deployed with a first sub-network and used for acquiring data to be processed and inputting the data to be processed into the first sub-network to obtain a sub-result; compressing the sub-result, and sending the compressed sub-result to a server;
the server is provided with a second sub-network and used for decompressing the received sub-results, inputting the decompressed sub-results into the second sub-network, obtaining a processing result and sending the processing result to the client; wherein the first sub-network and the second sub-network comprise different functional layers in the same neural network, and the first sub-network comprises at least one convolutional layer and a pooling layer.
Referring to fig. 7, a schematic structural diagram of an electronic device according to an embodiment of the present invention is provided. FIG. 7 illustrates a block diagram of an electronic device 312 suitable for use in implementing embodiments of the present invention. The electronic device 312 shown in fig. 7 is only an example and should not limit the functions or scope of use of the embodiments of the present invention. Device 312 is a typical computing device with data processing functions.
As shown in fig. 7, electronic device 312 is in the form of a general purpose computing device. The components of the electronic device 312 may include, but are not limited to: one or more processors 316, a storage device 328, and a bus 318 that couples the various system components including the storage device 328 and the processors 316.
Bus 318 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Electronic device 312 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 312 and includes both volatile and nonvolatile media, removable and non-removable media.
Storage 328 may include computer system readable media in the form of volatile Memory, such as Random Access Memory (RAM) 330 and/or cache Memory 332. The electronic device 312 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 334 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 7, and commonly referred to as a "hard drive"). Although not shown in FIG. 7, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact disk-Read Only Memory (CD-ROM), a Digital Video disk (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 318 by one or more data media interfaces. Storage 328 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
Program 336 having a set (at least one) of program modules 326 may be stored, for example, in storage 328, such program modules 326 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which may comprise an implementation of a network environment, or some combination thereof. Program modules 326 generally carry out the functions and/or methodologies of embodiments of the invention as described herein.
Electronic device 312 may also communicate with one or more external devices 314 (e.g., keyboard, pointing device, camera, display 324, etc.), with one or more devices that enable a user to interact with electronic device 312, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 312 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 322. Also, the electronic device 312 may communicate with one or more networks (e.g., a Local Area Network (LAN), Wide Area Network (WAN), and/or a public Network, such as the internet) via the Network adapter 320. As shown, a network adapter 320 communicates with the other modules of the electronic device 312 via the bus 318. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 312, including but not limited to: microcode, device drivers, Redundant processing units, external disk drive Arrays, disk array (RAID) systems, tape drives, and data backup storage systems, to name a few.
The processor 316 executes various functional applications and data processing by executing programs stored in the storage 328, for example, to implement the data processing methods provided by the above-described embodiments of the present invention.
Embodiments of the present invention also provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the data processing method provided by the embodiments of the present invention.
Of course, the computer program stored on the computer-readable storage medium provided by the embodiments of the present invention is not limited to the method operations described above, and may also perform related operations in the data processing method provided by any embodiments of the present invention.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A data processing method, performed by a client, comprising:
acquiring data to be processed, inputting the data to be processed into a first sub-network deployed in the client, and acquiring a sub-result; wherein the first subnetwork comprises at least one convolutional layer and a pooling layer;
compressing the sub-result;
sending the compressed sub-result to a server side, so that the server side decompresses the sub-result, and inputs the decompressed sub-result into a second sub-network deployed in the server side to obtain a processing result;
receiving a processing result returned by the server;
wherein the first and second sub-networks comprise different functional layers in the same neural network.
2. The method of claim 1, wherein compressing the sub-result comprises:
and compressing the sub-result by adopting a data quantization and/or source coding method.
3. The method of claim 2, wherein compressing the sub-result by data quantization comprises:
finding a maximum value in the sub-results;
determining an exponential digit according to the maximum value;
and converting the floating point number in the sub-result into an integer according to the exponent digit.
4. The method of claim 2, wherein compressing the sub-result by source coding comprises: and compressing the sub-result by adopting a Huffman coding method.
5. The method of any of claims 1-4, wherein the first subnetwork further comprises an activation layer disposed after the convolutional layer; and compressing the sub-result further comprises compressing the sub-result by a sparse compression method.
6. Method according to any of claims 1-4, wherein the step size of the convolutional layer and the pooling layer in the first subnetwork is larger than 1.
7. A data processing method is characterized in that the method is executed by a server side and comprises the following steps:
receiving a sub-result sent by a client; the sub-result is obtained by the client inputting the data to be processed into a first sub-network deployed in the client and performing compression processing; wherein the first subnetwork comprises at least one convolutional layer and a pooling layer;
decompressing the sub-result, and inputting the decompressed sub-result into a second sub-network deployed in the server to obtain a processing result;
sending the processing result to a client; wherein the first and second sub-networks comprise different functional layers in the same neural network.
8. The method of claim 7, wherein decompressing the sub-result comprises:
in response to the sub-result having been compressed by a data quantization method, restoring the quantized data;
in response to the sub-result having been compressed by a source coding method, decoding the compressed data;
and in response to the sub-result having been compressed by a sparsification method, recovering the compressed data.
9. An electronic device, comprising:
a processor;
a memory for storing a computer program;
when executed by the processor, cause the processor to implement the data processing method of any of claims 1-8.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the data processing method of any one of claims 1 to 8.
CN201911293638.8A 2019-12-16 2019-12-16 Data processing method, electronic device and computer readable storage medium Pending CN111049836A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911293638.8A CN111049836A (en) 2019-12-16 2019-12-16 Data processing method, electronic device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911293638.8A CN111049836A (en) 2019-12-16 2019-12-16 Data processing method, electronic device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN111049836A (en) 2020-04-21

Family

ID=70236642

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911293638.8A Pending CN111049836A (en) 2019-12-16 2019-12-16 Data processing method, electronic device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111049836A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159297A (en) * 2021-04-29 2021-07-23 上海阵量智能科技有限公司 Neural network compression method and device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160086078A1 (en) * 2014-09-22 2016-03-24 Zhengping Ji Object recognition with reduced neural network weight precision
CN109685202A (en) * 2018-12-17 2019-04-26 腾讯科技(深圳)有限公司 Data processing method and device, storage medium and electronic device
CN109901814A (en) * 2019-02-14 2019-06-18 上海交通大学 Customized floating number and its calculation method and hardware configuration
CN110263910A (en) * 2018-03-12 2019-09-20 罗伯特·博世有限公司 For storing the method and apparatus for efficiently running neural network
CN110489428A (en) * 2019-08-26 2019-11-22 上海燧原智能科技有限公司 Multi-dimensional sparse matrix compression method, decompression method, device, equipment and medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160086078A1 (en) * 2014-09-22 2016-03-24 Zhengping Ji Object recognition with reduced neural network weight precision
CN110263910A (en) * 2018-03-12 2019-09-20 罗伯特·博世有限公司 For storing the method and apparatus for efficiently running neural network
CN109685202A (en) * 2018-12-17 2019-04-26 腾讯科技(深圳)有限公司 Data processing method and device, storage medium and electronic device
CN109901814A (en) * 2019-02-14 2019-06-18 上海交通大学 Customized floating number and its calculation method and hardware configuration
CN110489428A (en) * 2019-08-26 2019-11-22 上海燧原智能科技有限公司 Multi-dimensional sparse matrix compression method, decompression method, device, equipment and medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159297A (en) * 2021-04-29 2021-07-23 上海阵量智能科技有限公司 Neural network compression method and device, computer equipment and storage medium
CN113159297B (en) * 2021-04-29 2024-01-09 上海阵量智能科技有限公司 Neural network compression method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US20190190538A1 (en) Accelerator hardware for compression and decompression
Gan et al. Compressing the CNN architecture for in-air handwritten Chinese character recognition
US9966971B2 (en) Character conversion
CN113327599B (en) Voice recognition method, device, medium and electronic equipment
WO2020207410A1 (en) Data compression method, electronic device, and storage medium
US20180041224A1 (en) Data value suffix bit level compression
JP2022525897A (en) Methods and equipment for compression / decompression of neural network models
US11803693B2 (en) Text compression with predicted continuations
CN114116635A (en) Parallel decompression of compressed data streams
WO2022028197A1 (en) Image processing method and device thereof
CN111091182A (en) Data processing method, electronic device and storage medium
CN116978011B (en) Image semantic communication method and system for intelligent target recognition
CN114529741A (en) Picture duplicate removal method and device and electronic equipment
CN114610650A (en) Memory compression method and device, storage medium and electronic equipment
CN114266230A (en) Text structuring processing method and device, storage medium and computer equipment
CN111049836A (en) Data processing method, electronic device and computer readable storage medium
US20210201134A1 (en) Data output method, data acquisition method, device, and electronic apparatus
WO2023159820A1 (en) Image compression method, image decompression method, and apparatuses
CN109474826B (en) Picture compression method and device, electronic equipment and storage medium
CN114970470A (en) Method and device for processing file information, electronic equipment and computer readable medium
CN112800183A (en) Content name data processing method and terminal equipment
CN115409150A (en) Data compression method, data decompression method and related equipment
CN113591987B (en) Image recognition method, device, electronic equipment and medium
CN113591983B (en) Image recognition method and device
CN117376634B (en) Short video music distribution method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200421)