WO2018077293A1 - Data transmission method and system, and electronic device - Google Patents
Data transmission method and system, and electronic device
- Publication number
- WO2018077293A1 (PCT/CN2017/108450)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- node
- matrix
- sparse
- deep learning
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24143—Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Definitions
- the present application relates to deep learning techniques, and more particularly to data transmission methods and systems, and electronic devices.
- the deep learning training system is a computing system that acquires a deep learning model by training input data.
- the deep learning training system needs to process a large amount of training data.
- for example, the ImageNet dataset released by the Stanford Computer Vision Laboratory contains more than 14 million high-resolution images.
- due to limits on computing power and memory, single-node deep learning training systems often take weeks or even months to complete training. Because of this, distributed deep learning training systems have received extensive attention in industry and academia.
- a typical distributed deep learning training system uses a distributed computing framework to run a gradient descent algorithm.
- the network traffic generated by gradient aggregation and parameter broadcasts is usually proportional to the size of the deep learning model.
- meanwhile, newer deep learning models keep growing in size.
- for example, the AlexNet model contains more than 60 million parameters, and the VGG-16 model has hundreds of millions of parameters. A large amount of network traffic is therefore generated during deep learning training; constrained by network bandwidth and other conditions, communication time becomes one of the performance bottlenecks of distributed deep learning training systems.
- the embodiment of the present application provides a data transmission scheme.
- an embodiment of the present application provides a data transmission method, including: determining first data to be sent by a node in a distributed system to at least one other node for updating the parameters of a deep learning model trained by the distributed system; performing sparse processing on at least part of the first data; and sending the at least partially sparsified first data to the at least one other node.
- optionally, performing sparse processing on at least part of the first data includes: comparing the at least part of the first data with a given filtering threshold, and filtering out from the at least part the portions smaller than the filtering threshold, where the filtering threshold decreases as the number of training iterations of the deep learning model increases.
- optionally, before performing sparse processing on the at least part of the first data, the method further includes: randomly determining a portion of the first data as the at least part; and performing sparse processing on the determined at least part of the first data.
- optionally, sending the at least partially sparsified first data to the at least one other node includes: compressing the at least partially sparsified first data; and sending the compressed first data to the at least one other node.
- optionally, the method further includes: acquiring second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system; and updating the parameters of the deep learning model at least according to the second data.
- optionally, acquiring the second data sent by the at least one other node includes: receiving and decompressing the second data compressed and sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- optionally, the first data includes: a gradient matrix computed in any training pass during iterative training of the deep learning model; and/or a parameter difference matrix between the old parameters of any training pass during iterative training of the deep learning model and the new parameters obtained by updating those old parameters at least according to second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- optionally, when the first data includes the gradient matrix, performing sparse processing on at least part of the first data includes: selecting, from the gradient matrix, first partial matrix elements whose absolute values are respectively smaller than the filtering threshold; randomly selecting second partial matrix elements from the gradient matrix; and setting to 0 the values of the matrix elements of the gradient matrix that belong to both the first partial matrix elements and the second partial matrix elements, to obtain a sparse gradient matrix. Sending the at least partially sparsified first data to the at least one other node then includes: compressing the sparse gradient matrix into a character string; and sending the character string to the at least one other node through a network.
- optionally, when the first data includes the parameter difference matrix, performing sparse processing on at least part of the first data includes: selecting, from the parameter difference matrix, third partial matrix elements whose absolute values are respectively smaller than the filtering threshold; randomly selecting fourth partial matrix elements from the parameter difference matrix; and setting to 0 the values of the matrix elements of the parameter difference matrix that belong to both the third partial matrix elements and the fourth partial matrix elements, to obtain a sparse parameter difference matrix. Sending the at least partially sparsified first data to the at least one other node then includes: compressing the sparse parameter difference matrix into a character string; and sending the character string to the at least one other node through the network.
- a data transmission system including:
- a data determining module, configured to determine first data to be sent by a node in the distributed system to at least one other node for updating the parameters of the deep learning model trained by the distributed system;
- a sparse processing module, configured to perform sparse processing on at least part of the first data;
- a data sending module, configured to send the at least partially sparsified first data to the at least one other node.
- optionally, the sparse processing module includes: a filtering submodule, configured to compare the at least part of the first data with a given filtering threshold and filter out from the at least part the portions smaller than the filtering threshold, where the filtering threshold decreases as the number of training iterations of the deep learning model increases.
- optionally, the sparse processing module further includes: a random selection submodule, configured to randomly determine a portion of the first data as the at least part; and a sparse submodule, configured to perform sparse processing on the determined at least part of the first data.
- optionally, the data sending module includes: a compression submodule, configured to compress the at least partially sparsified first data; and a sending submodule, configured to send the compressed first data to the at least one other node.
- optionally, the system further includes: a data acquisition module, configured to acquire second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system; and an update module, configured to update the parameters of the deep learning model at least according to the second data.
- optionally, the data acquisition module includes: a receiving and decompressing submodule, configured to receive and decompress the second data compressed and sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- optionally, the first data includes: a gradient matrix computed in any training pass during iterative training of the deep learning model; and/or a parameter difference matrix between the old parameters of any training pass during iterative training of the deep learning model and the new parameters obtained by updating those old parameters at least according to second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- optionally, when the first data includes the gradient matrix, the filtering submodule is configured to select, from the gradient matrix, first partial matrix elements whose absolute values are respectively smaller than the filtering threshold; the random selection submodule is configured to randomly select second partial matrix elements from the gradient matrix; the sparse submodule is configured to set to 0 the values of the matrix elements of the gradient matrix that belong to both the first partial matrix elements and the second partial matrix elements, to obtain a sparse gradient matrix; the compression submodule is configured to compress the sparse gradient matrix into a character string; and the sending submodule is configured to send the character string to the at least one other node through a network.
- optionally, when the first data includes the parameter difference matrix, the filtering submodule is configured to select, from the parameter difference matrix, third partial matrix elements whose absolute values are respectively smaller than the filtering threshold; the random selection submodule is configured to randomly select fourth partial matrix elements from the parameter difference matrix; the sparse submodule is configured to set to 0 the values of the matrix elements of the parameter difference matrix that belong to both the third partial matrix elements and the fourth partial matrix elements, to obtain a sparse parameter difference matrix; the compression submodule is configured to compress the sparse parameter difference matrix into a character string; and the sending submodule is configured to send the character string to the at least one other node through the network.
- an electronic device including the data transmission system described in any of the embodiments of the present application.
- an electronic device, including: a processor and the data transmission system of any of the embodiments of the present application; when the processor runs the data transmission system, the units in the data transmission system of any of the embodiments of the present application are run.
- an electronic device, including: one or more processors, a memory, a communication component, and a communication bus, where the processor, the memory, and the communication component communicate with each other through the communication bus;
- the memory is configured to store at least one executable instruction, the executable instruction causing the processor to perform operations corresponding to the data transmission method provided by any embodiment of the present application.
- a computer program, including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the steps of the data transmission method described in any of the above embodiments.
- a computer-readable storage medium for storing computer-readable instructions; when the instructions are executed, the operations of the steps of the data transmission method described in any of the above embodiments of the present application are implemented.
- the data transmission method and system, electronic device, program, and medium provided by the embodiments of the present application determine first data to be sent by any node in the distributed system to at least one other node for updating the parameters of the deep learning model trained by the distributed system, perform sparse processing on at least part of the first data, and send the at least partially sparsified first data to the at least one other node.
- embodiments of the present application can eliminate at least part of the unimportant data (such as gradients and/or parameters), reduce the network traffic generated by each gradient aggregation and/or parameter broadcast, and shorten training time.
- the present application does not need to reduce the communication frequency and can obtain the latest parameters in time; it can be used in deep learning training systems that communicate in every iteration, as well as in systems that need to reduce the communication frequency.
- FIG. 1 is a flow chart of an embodiment of a data transmission method in accordance with the present application.
- FIG. 2 is an exemplary flow chart of gradient filtering in an embodiment of a data transmission method of the present application.
- FIG. 3 is an exemplary flow chart of parameter filtering in an embodiment of a data transmission method of the present application.
- FIG. 4 is a schematic structural diagram of an embodiment of a data transmission system according to the present application.
- FIG. 5 is a schematic structural diagram of another embodiment of a data transmission system according to the present application.
- FIG. 6 is a schematic structural diagram of an embodiment of a node device according to the present application.
- FIG. 7 is a schematic structural diagram of an embodiment of an electronic device according to the present application.
- Embodiments of the present application can be applied to electronic devices such as terminal devices, computer systems, servers, etc., which can operate with numerous other general purpose or special purpose computing system environments or configurations.
- examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments including any of the above systems, and the like.
- electronic devices such as terminal devices, computer systems, and servers can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system.
- program modules may include routines, programs, object programs, components, logic, data structures, and the like, which perform particular tasks or implement particular abstract data types.
- the computer system/server can be implemented in a distributed cloud computing environment where tasks are performed by remote processing devices that are linked through a communication network.
- program modules may be located on a local or remote computing system storage medium including storage devices.
- FIG. 1 is a flow chart of an embodiment of a data transmission method in accordance with the present application. As shown in FIG. 1, the data transmission method of this embodiment includes:
- in step S110, first data to be sent by a node in the distributed system to at least one other node for updating the parameters of the deep learning model trained by the distributed system is determined.
- the distributed system therein may be, for example, a cluster composed of a plurality of computing nodes, or may be composed of a plurality of computing nodes and a parameter server.
- the deep learning model therein may include, for example, but not limited to, a neural network (such as a convolutional neural network), and the parameters may be, for example, matrix variables for constructing a deep learning model, and the like.
- step S110 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a data determination module executed by the processor.
- in step S120, sparse processing is performed on at least part of the first data.
- the sparse processing removes less important parts from the first data, thereby reducing the network traffic consumed in transmitting the first data and reducing the training time of the deep learning model.
- step S120 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a sparse processing module executed by the processor.
- in step S130, the at least partially sparsified first data is sent to the at least one other node.
- step S130 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by a data transmitting module executed by the processor.
- the data transmission method of the embodiment of the present application is used to transmit, between any two computing nodes, or between a computing node and a parameter server, in a distributed deep learning system, data for updating the parameters of the deep learning model running on the computing nodes. It may omit less important parts of the transmitted data, such as unimportant gradients and/or parameters, which helps reduce the network traffic generated during aggregation and broadcast operations, thereby reducing the network transmission time of each iteration and shortening the overall training time of deep learning.
- in an optional embodiment, performing sparse processing on at least part of the first data may include: comparing the at least part of the first data with a given filtering threshold, and filtering out from the at least part the portions smaller than the filtering threshold.
- the filtering threshold may decrease as the number of training iterations of the deep learning model increases, so that minor parameters are less likely to be selected and eliminated in the later stages of training.
- in an optional embodiment, before performing sparse processing on the at least part of the first data, the method may further include: randomly determining a portion of the first data as the at least part; and performing sparse processing on the determined at least part of the first data.
- in this case, part of the data in the first data is sparsely processed, while the remaining data in the first data is not.
- the part of the data that has not been sparsely processed can be sent in a conventional manner.
- in an optional example, this may be performed by a processor invoking corresponding instructions stored in a memory, or may be performed by the sparse processing module run by the processor, for example, by the random selection submodule and the sparse submodule in the sparse processing module.
- the sending, by the at least one other node, the at least part of the first data after performing the thinning process may include: compressing the first data that is at least partially subjected to the sparse processing, and compressing may adopt a general compression algorithm.
- a compression algorithm such as snappy, zlib
- transmitting the compressed first data to the at least one other node may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a data transmitting module executed by the processor, such as a compression sub-module in a data transmission module that may be executed by the processor, respectively.
- the sending submodule is executed.
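As a minimal sketch of this compress-then-send step in Python (the text names snappy and zlib as examples of general-purpose compressors; the numpy matrix type and pickle serialization are illustrative assumptions, not part of the embodiments):

```python
import pickle
import zlib

import numpy as np

def compress_sparse(matrix: np.ndarray) -> bytes:
    """Serialize a (mostly zero) matrix and compress it with zlib.

    zlib is one of the general-purpose compressors named in the text;
    pickle serialization is an illustrative assumption.
    """
    return zlib.compress(pickle.dumps(matrix))

def decompress_sparse(payload: bytes) -> np.ndarray:
    """Inverse of compress_sparse: decompress with zlib, then deserialize."""
    return pickle.loads(zlib.decompress(payload))
```

Because the sparsified matrix is mostly zeros, a general-purpose compressor shrinks it substantially; the receiving node reverses both steps with decompress_sparse.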
- the method may further include:
- any one of the foregoing nodes acquires, from the at least one other node, second data for updating the parameters of the deep learning model trained by the distributed system, for example, by receiving and decompressing the second data compressed and sent by the at least one other node.
- in an optional example, this may be performed by a processor invoking corresponding instructions stored in a memory, or may be performed by a data acquisition module run by the processor;
- the parameters of the deep learning model are then updated at least according to the second data.
- the update may occur when any of the above nodes completes the current round of training during the iterative training of the deep learning model. In an optional example, this may be performed by the processor invoking the corresponding instruction stored in the memory, or by an update module run by the processor.
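A sketch of this receive-and-update path for the gradient case might look as follows, reusing decompress_sparse from the earlier sketch; the plain SGD form of the update is an illustrative assumption, since the embodiments only require that the parameters be updated at least according to the received second data:

```python
def apply_received_gradient(params, payload, lr=0.01):
    """Decompress received second data (a sparse gradient matrix) and
    update the locally cached parameters.

    params is assumed to be a numpy array; the SGD step below is an
    illustrative assumption, not mandated by the embodiments.
    """
    grad = decompress_sparse(payload)  # receive-and-decompress step
    params -= lr * grad                # update based on the second data
    return params
```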
- in an optional embodiment, the first data includes: a gradient matrix computed by any one of the above nodes in any training pass during the iterative training of the deep learning model.
- the distributed deep learning training system takes the raw gradient values (including the gradient values produced by each computing node) as input, and the input gradients can be single-precision values.
- the gradient matrix is a matrix variable used to update the parameters of the deep learning model.
- in another optional embodiment, the first data includes: the old parameters of any training pass of the above node during the iterative training of the deep learning model, and a parameter difference matrix between those old parameters and the new parameters obtained by updating the old parameters at least according to the second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- the distributed deep learning training system replaces the parameters cached by each computing node with the newly updated parameters.
- the parameters refer to the matrix variables that construct the deep learning model, and can be matrices of single-precision values.
- in an optional embodiment, when the first data includes the gradient matrix, performing sparse processing on at least part of the first data may include: selecting, from the gradient matrix, first partial matrix elements whose absolute values are respectively smaller than the filtering threshold; randomly selecting second partial matrix elements from the gradient matrix; and setting to 0 the values of the matrix elements of the gradient matrix that belong to both the first partial matrix elements and the second partial matrix elements, to obtain a sparse gradient matrix.
- sending the at least partially sparsified first data to the at least one other node may then include: compressing the sparse gradient matrix into a character string; and sending the character string to the at least one other node through the network.
- FIG. 2 is an exemplary flow chart of gradient filtering in an embodiment of a data transmission method of the present application. As shown in FIG. 2, this embodiment includes:
- in step S210, a number of gradients are selected from the original gradient matrix, for example, using an absolute-value strategy.
- the absolute-value strategy selects the gradients whose absolute values are less than a given filtering threshold.
- the filtering threshold can, for example, be calculated from an initial threshold θgsmp and a preset constant dgsmp as a function of the current iteration number t (the formula image is not reproduced in this text). θgsmp represents the initial filtering threshold and can be preset before the deep learning training; dgsmp is also a preset constant. In a deep learning training system, the required number of iterations can be specified in advance, and t represents the current iteration number of the deep learning training. The term dgsmp·log(t) dynamically changes the filtering threshold as the number of iterations increases: as the iterations increase, the filtering threshold becomes smaller and smaller, so that in the later stages of training small gradients are less likely to be eliminated. In this embodiment, the value of θgsmp can be between 1×10⁻⁴ and 1×10⁻³, and the value of dgsmp can be between 0.1 and 1; the specific values can be adjusted for the specific application.
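Because the formula image is not reproduced in this text, the following sketch only illustrates a schedule with the behavior described above; the exact functional form is an assumption (chosen so that the threshold equals θgsmp at t = 1 and shrinks as dgsmp·log(t) grows):

```python
import math

def filtering_threshold(t: int, theta_gsmp: float = 5e-4, d_gsmp: float = 0.5) -> float:
    """Illustrative decreasing filtering-threshold schedule.

    The patent's exact formula is not reproduced in this text; this form
    is an assumption matching the described behavior: the threshold starts
    at theta_gsmp (log(1) = 0) and shrinks as d_gsmp * log(t) grows.
    theta_gsmp is typically in [1e-4, 1e-3] and d_gsmp in [0.1, 1].
    """
    return theta_gsmp / (1.0 + d_gsmp * math.log(t))
```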
- in step S220, a number of gradients are selected from the input raw gradient matrix, for example, using a stochastic strategy.
- the stochastic strategy randomly selects a given ratio of all the input gradient values, for example, 50%-90% or 60%-80% of the gradients.
- steps S210-S220 may be performed by a processor invoking corresponding instructions stored in the memory, or may be performed by the sparse processing module run by the processor, or the filtering submodule and random selection submodule therein.
- in step S230, the gradient values selected by both the absolute-value strategy and the stochastic strategy are unimportant to the computation and have little influence on it; they are set to 0, thereby converting the input gradient matrix into a sparse gradient matrix.
- in step S240, the sparse gradient matrix is processed using a compression strategy to reduce its volume.
- the compression strategy uses a general-purpose compression algorithm, such as snappy or zlib, to compress the sparse gradient matrix into a character string.
- steps S230-S240 may be performed by a processor invoking corresponding instructions stored in the memory, or may be performed by the sparse processing module run by the processor, or the sparse submodule and compression submodule therein.
- after the culling operations of the absolute-value and stochastic strategies and the compression operation of the compression strategy, the gradient matrix is output as a character string whose volume is greatly reduced.
- the computing node sends the generated character string through the network; the network traffic generated by this process is correspondingly reduced, so the communication time of the gradient aggregation process can be effectively reduced.
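Putting steps S210-S240 together, a minimal sender-side sketch (numpy-based; the 0.7 drop ratio sits inside the 50%-90% range mentioned above, and compress_sparse / filtering_threshold come from the earlier sketches) might look like:

```python
import numpy as np

def sparsify(matrix: np.ndarray, threshold: float, drop_ratio: float = 0.7,
             rng=None) -> np.ndarray:
    """Steps S210-S230: zero out the entries selected by BOTH strategies.

    An entry is set to 0 only if its absolute value is below the filtering
    threshold (absolute-value strategy) AND it falls into a randomly chosen
    fraction of entries (stochastic strategy).
    """
    rng = rng or np.random.default_rng()
    small = np.abs(matrix) < threshold                   # absolute-value strategy
    random_pick = rng.random(matrix.shape) < drop_ratio  # stochastic strategy
    sparse = matrix.copy()
    sparse[small & random_pick] = 0.0                    # entries in both selections
    return sparse

# Step S240 plus the network send: the string transmitted at iteration t.
# payload = compress_sparse(sparsify(grad, filtering_threshold(t)))
```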
- in an optional embodiment, when the first data includes the parameter difference matrix, performing sparse processing on at least part of the first data may include: selecting, from the parameter difference matrix, third partial matrix elements whose absolute values are respectively smaller than the filtering threshold; randomly selecting fourth partial matrix elements from the parameter difference matrix; and setting to 0 the values of the matrix elements of the parameter difference matrix that belong to both the third partial matrix elements and the fourth partial matrix elements, to obtain a sparse parameter difference matrix.
- sending the at least partially sparsified first data to the at least one other node may then include: compressing the sparse parameter difference matrix into a character string; and sending the character string to the at least one other node through the network.
- FIG. 3 is an exemplary flow chart of parameter filtering in an embodiment of a data transmission method of the present application.
- in FIG. 3, the newly updated parameters of the deep learning model are denoted by θnew, the cached old parameters are denoted by θold, and their difference forms the parameter difference matrix θdiff.
- this embodiment includes:
- in step S310, a number of values are selected from the parameter difference matrix θdiff, for example, using an absolute-value strategy.
- the absolute-value strategy selects the values whose absolute values are less than a given filtering threshold.
- the filtering threshold here can be calculated in the same exemplary manner as in the gradient filtering embodiment (the formula image is not reproduced in this text): θgsmp represents the initial filtering threshold and can be preset before the deep learning training, dgsmp is also a preset constant, and t represents the current iteration number. The term dgsmp·log(t) dynamically decreases the filtering threshold as the number of iterations increases, so that in the later stages of training small values are less likely to be eliminated. The value of θgsmp can be between 1×10⁻⁴ and 1×10⁻³, and the value of dgsmp can be between 0.1 and 1; the specific values can be adjusted for the specific application.
- in step S320, a number of values are selected from the θdiff matrix, for example, using a stochastic strategy.
- the stochastic strategy randomly selects a given ratio of all the input θdiff values, for example, 50%-90% or 60%-80% of the values.
- steps S310-S320 may be performed by the processor invoking corresponding instructions stored in the memory, or may be performed by the sparse processing module run by the processor, or the filtering submodule and random selection submodule therein.
- in step S330, the θdiff values selected by both the absolute-value strategy and the stochastic strategy are set to 0, thereby converting the θdiff matrix into a sparse matrix.
- in step S340, the sparse matrix is processed using a compression strategy to reduce its volume.
- the compression strategy uses a general-purpose compression algorithm, such as snappy or zlib, to compress the sparse matrix into a character string.
- steps S330-S340 may be performed by a processor invoking corresponding instructions stored in the memory, or may be performed by the sparse processing module run by the processor, or the sparse submodule and compression submodule therein.
- by broadcasting the generated character string through the network, the deep learning training system can greatly reduce the network traffic generated by the parameter broadcast operation; the communication time can therefore be effectively reduced, thereby reducing the overall deep learning training time.
- on the receiving side, a decompression operation is performed, and θdiff is added to the cached θold to update the corresponding parameters.
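A sketch of both sides of this parameter filtering flow, under the same assumptions as the earlier sketches (function names illustrative; sparsify, filtering_threshold, compress_sparse, and decompress_sparse defined above):

```python
def broadcast_payload(theta_new, theta_old, t):
    """Sender side (steps S310-S340): sparsify theta_diff = theta_new - theta_old
    and compress it into a byte string for broadcast."""
    theta_diff = theta_new - theta_old
    return compress_sparse(sparsify(theta_diff, filtering_threshold(t)))

def apply_payload(theta_old, payload):
    """Receiver side: decompress theta_diff and add it to the cached old
    parameters to obtain the updated parameters."""
    return theta_old + decompress_sparse(payload)
```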
- the same node can apply the gradient filtering mode shown in FIG. 2 and/or the parameter filtering mode shown in FIG. 3; the corresponding steps are not repeated here.
- any of the data transmission methods provided by the embodiments of the present application may be performed by any suitable device having data processing capabilities, including but not limited to: a terminal device, a server, and the like.
- any data transmission method provided by the embodiments of the present application may be executed by a processor; for example, the processor executes corresponding instructions stored in a memory to perform any of the data transmission methods mentioned in the embodiments of the present application. This will not be repeated below.
- the foregoing program may be stored in a computer-readable storage medium; when the program is executed, the steps of the foregoing method embodiments are performed. The foregoing storage medium includes media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
- FIG. 4 is a schematic structural diagram of an embodiment of a data transmission system according to the present application.
- the data transmission system of this embodiment of the present application can be used to implement the foregoing data transmission method embodiments of the present application. As shown in FIG. 4, the system of this embodiment includes:
- the data determining module 410 is configured to determine first data to be sent by a node in the distributed system to at least one other node for updating the parameters of the deep learning model trained by the distributed system;
- a sparse processing module 420 configured to perform sparse processing on at least part of the first data
- in an optional embodiment, the sparse processing module 420 may include: a filtering submodule 422, configured to compare the at least part of the first data with a given filtering threshold, and filter out from the compared at least part of the first data the portions smaller than the filtering threshold, where the filtering threshold decreases as the number of training iterations of the deep learning model increases.
- the data sending module 430 is configured to send, to the at least one other node, the first data that is at least partially subjected to the sparse processing.
- in an optional embodiment, the sparse processing module 420 may further include: a random selection submodule, configured to randomly determine, according to a predetermined policy, a portion of the first data as the at least part; and a sparse submodule, configured to perform sparse processing on the determined at least part of the first data.
- in an optional embodiment, the data sending module 430 may include: a compression submodule 432, configured to compress the at least partially sparsified first data; and a sending submodule 434, configured to send the compressed first data to the at least one other node.
- FIG. 5 is a schematic structural diagram of another embodiment of a data transmission system according to the present application. As shown in FIG. 5, compared with the embodiment shown in FIG. 4, the data transmission system of this embodiment further includes:
- the data acquisition module 510 is configured to acquire second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system;
- the update module 520 is configured to update the parameters of the deep learning model of the node at least according to the second data.
- in an optional embodiment, the data acquisition module 510 may include a receiving and decompressing submodule 512, configured to receive and decompress the second data compressed and sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- in an optional embodiment, the first data includes: a gradient matrix computed by the node in any training pass during the iterative training of the deep learning model; and/or a parameter difference matrix between the old parameters of any training pass of the node during the iterative training of the deep learning model and the new parameters obtained by updating those old parameters at least according to second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- in an optional embodiment, when the first data includes the gradient matrix, the filtering submodule 422 is configured to select, from the gradient matrix, first partial matrix elements whose absolute values are respectively smaller than the given filtering threshold; the random selection submodule is configured to randomly select second partial matrix elements from the gradient matrix; the sparse submodule is configured to set to 0 the values of the matrix elements of the gradient matrix that belong to both the first partial matrix elements and the second partial matrix elements, to obtain a sparse gradient matrix; the compression submodule is configured to compress the sparse gradient matrix into a character string; and the sending submodule is configured to send the character string to the at least one other node through the network.
- in an optional embodiment, when the first data includes the parameter difference matrix, the filtering submodule is configured to select, from the parameter difference matrix, third partial matrix elements whose absolute values are respectively smaller than the given filtering threshold; the random selection submodule is configured to randomly select fourth partial matrix elements from the parameter difference matrix; the sparse submodule is configured to set to 0 the values of the matrix elements of the parameter difference matrix that belong to both the third partial matrix elements and the fourth partial matrix elements, to obtain a sparse parameter difference matrix; the compression submodule is configured to compress the sparse parameter difference matrix into a character string; and the sending submodule is configured to send the character string to the at least one other node through the network.
- the embodiment of the present application further provides an electronic device, including the data transmission system of any of the foregoing embodiments of the present application.
- the embodiment of the present application further provides another electronic device, including: one or more processors, a memory, a plurality of cache components, a communication component, and a communication bus, where the processor, the memory, the plurality of cache components, and the communication component communicate with each other through the communication bus; the transmission rates and/or storage spaces of the plurality of cache components differ, and the plurality of cache components are preset with different lookup priorities according to the transmission rate and/or the storage space;
- the memory is configured to store at least one executable instruction, and the executable instruction causes the processor to perform an operation corresponding to the data transmission method of any of the above embodiments of the present application.
- FIG. 6 is a schematic structural diagram of an embodiment of a node device according to the present application. It includes a processor 602, a communication component 604, a memory 606, and a communication bus 608. The communication component may include, but is not limited to, an I/O interface, a network card, and the like.
- Processor 602, communication component 604, and memory 606 complete communication with one another via communication bus 608.
- the communication component 604 is configured to communicate with network elements of other devices, such as a client or a data collection device.
- the processor 602 is configured to execute the program 610. Specifically, the related steps in the foregoing method embodiments may be performed.
- the program can include program code, the program code including computer operating instructions.
- the processor 602 may be one or more, and may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
- the memory 606 is configured to store the program 610.
- Memory 606 may include high speed RAM memory and may also include non-volatile memory, such as at least one disk memory.
- the program 610 includes at least one executable instruction, which may specifically cause the processor 602 to perform the following operations: determining first data to be sent by any node in the distributed system to at least one other node for updating the parameters of the deep learning model trained by the distributed system; performing sparse processing on at least part of the first data; and sending the at least partially sparsified first data to the at least one other node.
- FIG. 7 is a schematic structural diagram of an embodiment of an electronic device according to the present application.
- the electronic device includes one or more processors, a communication unit, and the like; the one or more processors are, for example, one or more central processing units (CPUs) 701 and/or one or more image processors (GPUs), etc.
- the processor can perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 702 or executable instructions loaded from a storage portion 708 into a random access memory (RAM) 703.
- the communication portion 712 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the processor may communicate with the read-only memory 702 and/or the random access memory 703 to execute executable instructions, connect to the communication portion 712 through the bus 704, and communicate with other target devices via the communication portion 712, thereby performing operations corresponding to any data transmission method provided by the embodiments of the present application, for example: determining first data to be sent by any node in the distributed system to at least one other node for updating the parameters of the deep learning model trained by the distributed system; performing sparse processing on at least part of the first data; and sending the at least partially sparsified first data to the at least one other node.
- in addition, the RAM 703 can store various programs and data required for the operation of the device.
- the CPU 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704.
- ROM 702 is an optional module.
- the RAM 703 stores executable instructions, or executable instructions are written into the ROM 702 at runtime; the executable instructions cause the processor 701 to perform operations corresponding to the data transmission method described above.
- An input/output (I/O) interface 705 is also coupled to bus 704.
- the communication portion 712 may be integrated, or may be configured with a plurality of submodules (for example, a plurality of IB network cards) and be on the bus link.
- the following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, etc.; an output portion 707 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, and a speaker; a storage portion 708 including a hard disk or the like And a communication portion 709 including a network interface card such as a LAN card, a modem, or the like.
- the communication section 709 performs communication processing via a network such as the Internet.
- a drive 710 is also connected to the I/O interface 705 as needed.
- a removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like, is mounted on the drive 710 as needed so that a computer program read therefrom is installed into the storage portion 708 as needed.
- it should be noted that FIG. 7 shows only an optional implementation.
- in practice, the number and types of components in FIG. 7 may be selected, deleted, added, or replaced according to actual needs;
- different functional components may also be implemented separately or in an integrated manner; for example, the GPU and the CPU may be provided separately, or the GPU may be integrated on the CPU; the communication portion may be provided separately, or may be integrated on the CPU or the GPU; and so on.
- in particular, an embodiment of the present disclosure includes a computer program product, which includes a computer program tangibly embodied on a machine-readable medium; the computer program includes program code for executing the method illustrated in the flowchart, and the program code may include instructions corresponding to the method steps provided in the embodiments of the present application, for example: an instruction for determining first data to be sent by any node in the distributed system to at least one other node for updating the parameters of the deep learning model trained by the distributed system; an instruction for performing sparse processing on at least part of the first data; and an instruction for sending the at least partially sparsified first data to the at least one other node.
- the embodiment of the present application further provides a computer program, including computer-readable code; when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the steps of the data transmission method of any embodiment of the present application.
- the embodiment of the present application further provides a computer-readable storage medium for storing computer-readable instructions; when the instructions are executed, the operations of the steps of the data transmission method of any embodiment of the present application are implemented.
- the above methods according to the embodiments of the present application may be implemented in hardware or firmware; or implemented as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk); or implemented as computer code that is downloaded over a network, originally stored in a remote recording medium or a non-transitory machine-readable medium, and to be stored in a local recording medium, so that the methods described herein can be processed by such software stored on a recording medium using a general-purpose computer, a special-purpose processor, or programmable or dedicated hardware such as an ASIC or FPGA.
- as can be understood, a computer, processor, microprocessor controller, or programmable hardware includes storage components (for example, RAM, ROM, flash memory, etc.) that can store or receive software or computer code that, when accessed and executed by the computer, processor, or hardware, implements the processing methods described herein. Moreover, when a general-purpose computer accesses code for implementing the processing shown herein, the execution of the code converts the general-purpose computer into a special-purpose computer for performing the processing shown herein.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Complex Calculations (AREA)
- Mobile Radio Communication Systems (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
Claims (23)
- 1. A data transmission method, comprising: determining first data to be sent by a node in a distributed system to at least one other node for updating parameters of a deep learning model trained by the distributed system; performing sparse processing on at least part of the first data; and sending the at least partially sparsified first data to the at least one other node.
- 2. The method according to claim 1, wherein performing sparse processing on at least part of the first data comprises: comparing the at least part of the first data with a given filtering threshold, and filtering out, from the at least part, the portions smaller than the filtering threshold, wherein the filtering threshold decreases as the number of training iterations of the deep learning model increases.
- 3. The method according to claim 1 or 2, wherein, before performing sparse processing on at least part of the first data, the method further comprises: randomly determining a portion of the first data as the at least part; and performing sparse processing on the determined at least part of the first data.
- 4. The method according to any one of claims 1-3, wherein sending the at least partially sparsified first data to the at least one other node comprises: compressing the at least partially sparsified first data; and sending the compressed first data to the at least one other node.
- 5. The method according to any one of claims 1-4, further comprising: acquiring second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system; and updating the parameters of the deep learning model at least according to the second data.
- 6. The method according to claim 5, wherein acquiring the second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system comprises: receiving and decompressing the second data compressed and sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- 7. The method according to any one of claims 1-6, wherein the first data comprises: a gradient matrix computed in any training pass during iterative training of the deep learning model; and/or a parameter difference matrix between old parameters of any training pass during iterative training of the deep learning model and new parameters obtained by updating the old parameters at least according to second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- 8. The method according to claim 7, wherein, when the first data comprises the gradient matrix, performing sparse processing on at least part of the first data comprises: selecting, from the gradient matrix, first partial matrix elements whose absolute values are respectively smaller than the filtering threshold; randomly selecting second partial matrix elements from the gradient matrix; and setting to 0 the values of the matrix elements of the gradient matrix that belong to both the first partial matrix elements and the second partial matrix elements, to obtain a sparse gradient matrix; and sending the at least partially sparsified first data to the at least one other node comprises: compressing the sparse gradient matrix into a character string; and sending the character string to the at least one other node through a network.
- 9. The method according to claim 7 or 8, wherein, when the first data comprises the parameter difference matrix, performing sparse processing on at least part of the first data comprises: selecting, from the parameter difference matrix, third partial matrix elements whose absolute values are respectively smaller than the filtering threshold; randomly selecting fourth partial matrix elements from the parameter difference matrix; and setting to 0 the values of the matrix elements of the parameter difference matrix that belong to both the third partial matrix elements and the fourth partial matrix elements, to obtain a sparse parameter difference matrix; and sending the at least partially sparsified first data to the at least one other node comprises: compressing the sparse parameter difference matrix into a character string; and sending the character string to the at least one other node through the network.
- 10. A data transmission system, comprising: a data determining module, configured to determine first data to be sent by a node in a distributed system to at least one other node for updating parameters of a deep learning model trained by the distributed system; a sparse processing module, configured to perform sparse processing on at least part of the first data; and a data sending module, configured to send the at least partially sparsified first data to the at least one other node.
- 11. The system according to claim 10, wherein the sparse processing module comprises: a filtering submodule, configured to compare the at least part of the first data with a given filtering threshold and filter out, from the at least part, the portions smaller than the filtering threshold, wherein the filtering threshold decreases as the number of training iterations of the deep learning model increases.
- 12. The system according to claim 10 or 11, wherein the sparse processing module further comprises: a random selection submodule, configured to randomly determine a portion of the first data as the at least part; and a sparse submodule, configured to perform sparse processing on the determined at least part of the first data.
- 13. The system according to any one of claims 10-12, wherein the data sending module comprises: a compression submodule, configured to compress the at least partially sparsified first data; and a sending submodule, configured to send the compressed first data to the at least one other node.
- 14. The system according to any one of claims 10-13, further comprising: a data acquisition module, configured to acquire second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system; and an update module, configured to update the parameters of the deep learning model at least according to the second data.
- 15. The system according to claim 14, wherein the data acquisition module comprises: a receiving and decompressing submodule, configured to receive and decompress the second data compressed and sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- 16. The system according to any one of claims 10-15, wherein the first data comprises: a gradient matrix computed in any training pass during iterative training of the deep learning model; and/or a parameter difference matrix between old parameters of any training pass during iterative training of the deep learning model and new parameters obtained by updating the old parameters at least according to second data sent by the at least one other node for updating the parameters of the deep learning model trained by the distributed system.
- 17. The system according to claim 16, wherein, when the first data comprises the gradient matrix, the filtering submodule is configured to select, from the gradient matrix, first partial matrix elements whose absolute values are respectively smaller than the filtering threshold; the random selection submodule is configured to randomly select second partial matrix elements from the gradient matrix; the sparse submodule is configured to set to 0 the values of the matrix elements of the gradient matrix that belong to both the first partial matrix elements and the second partial matrix elements, to obtain a sparse gradient matrix; the compression submodule is configured to compress the sparse gradient matrix into a character string; and the sending submodule is configured to send the character string to the at least one other node through a network.
- 18. The system according to claim 16 or 17, wherein, when the first data comprises the parameter difference matrix, the filtering submodule is configured to select, from the parameter difference matrix, third partial matrix elements whose absolute values are respectively smaller than the filtering threshold; the random selection submodule is configured to randomly select fourth partial matrix elements from the parameter difference matrix; the sparse submodule is configured to set to 0 the values of the matrix elements of the parameter difference matrix that belong to both the third partial matrix elements and the fourth partial matrix elements, to obtain a sparse parameter difference matrix; the compression submodule is configured to compress the sparse parameter difference matrix into a character string; and the sending submodule is configured to send the character string to the at least one other node through the network.
- 19. An electronic device, comprising the data transmission system according to any one of claims 10-18.
- 20. An electronic device, comprising: a processor and the data transmission system according to any one of claims 10-18; wherein, when the processor runs the data transmission system, the units in the data transmission system according to any one of claims 10-18 are run.
- 21. An electronic device, comprising: one or more processors, a memory, a communication component, and a communication bus, wherein the processor, the memory, and the communication component communicate with each other through the communication bus; and the memory is configured to store at least one executable instruction that causes the processor to perform operations corresponding to the data transmission method according to any one of claims 1-9.
- 22. A computer program, comprising computer-readable code, wherein, when the computer-readable code runs on a device, a processor in the device executes instructions for implementing the steps of the data transmission method according to any one of claims 1-9.
- 23. A computer-readable storage medium for storing computer-readable instructions, wherein, when the instructions are executed, the operations of the steps of the data transmission method according to any one of claims 1-9 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/382,058 US20190236453A1 (en) | 2016-10-28 | 2019-04-11 | Method and system for data transmission, and electronic device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610972729.4A CN108021982B (zh) | 2016-10-28 | 2016-10-28 | Data transmission method and system, and electronic device |
CN201610972729.4 | 2016-10-28 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/382,058 Continuation US20190236453A1 (en) | 2016-10-28 | 2019-04-11 | Method and system for data transmission, and electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018077293A1 true WO2018077293A1 (zh) | 2018-05-03 |
Family
ID=62023122
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/108450 WO2018077293A1 (zh) | Data transmission method and system, and electronic device | 2016-10-28 | 2017-10-30 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20190236453A1 (zh) |
CN (1) | CN108021982B (zh) |
WO (1) | WO2018077293A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109740755A (zh) * | 2019-01-08 | 2019-05-10 | Shenzhen Onething Technologies Co., Ltd. | Data processing method based on gradient descent and related apparatus |
CN112364897A (zh) * | 2020-10-27 | 2021-02-12 | Dawning Information Industry (Beijing) Co., Ltd. | Distributed training method and apparatus, storage medium, and electronic device |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109214512B (zh) * | 2018-08-01 | 2021-01-22 | Zhongxing Feiliu Information Technology Co., Ltd. | Parameter exchange method, apparatus, server, and storage medium for deep learning |
CN109871942B (zh) * | 2019-02-19 | 2021-06-11 | Shanghai SenseTime Intelligent Technology Co., Ltd. | Neural network training method and apparatus, system, and storage medium |
CN110245743A (zh) * | 2019-05-23 | 2019-09-17 | Sun Yat-sen University | Asynchronous distributed deep learning training method, apparatus, and system |
US11451480B2 (en) * | 2020-03-31 | 2022-09-20 | Micron Technology, Inc. | Lightweight artificial intelligence layer to control the transfer of big data |
CN111625603A (zh) * | 2020-05-28 | 2020-09-04 | Inspur Electronic Information Industry Co., Ltd. | Gradient information updating method for distributed deep learning and related apparatus |
CN111857949B (zh) * | 2020-06-30 | 2023-01-10 | Suzhou Inspur Intelligent Technology Co., Ltd. | Model publishing method, apparatus, device, and storage medium |
CN112235384B (zh) * | 2020-10-09 | 2023-10-31 | Tencent Technology (Shenzhen) Co., Ltd. | Data transmission method, apparatus, device, and storage medium in a distributed system |
CN113242258B (zh) * | 2021-05-27 | 2023-11-14 | Antiy Technology Group Co., Ltd. | Threat detection method and apparatus for a host cluster |
CN113610210B (zh) * | 2021-06-28 | 2024-03-29 | Shenzhen University | Iterative updating method for a deep learning training network based on a smart NIC |
CN116980420B (zh) * | 2023-09-22 | 2023-12-15 | New H3C Technologies Co., Ltd. | Cluster communication method, system, apparatus, device, and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102405495A (zh) * | 2009-03-11 | 2012-04-04 | Google Inc. | Audio classification for information retrieval using sparse features |
CN105574506A (zh) * | 2015-12-16 | 2016-05-11 | Shenzhen SenseTime Technology Co., Ltd. | Intelligent face pursuit system and method based on deep learning and large-scale clustering |
CN105791189A (zh) * | 2016-02-23 | 2016-07-20 | Chongqing University | Sparse coefficient decomposition method for improving reconstruction accuracy |
WO2016154440A1 (en) * | 2015-03-24 | 2016-09-29 | Hrl Laboratories, Llc | Sparse inference modules for deep learning |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6970939B2 (en) * | 2000-10-26 | 2005-11-29 | Intel Corporation | Method and apparatus for large payload distribution in a network |
US7843855B2 (en) * | 2001-09-13 | 2010-11-30 | Network Foundation Technologies, Llc | System and method for broadcasting content to nodes on computer networks |
GB2493956A (en) * | 2011-08-24 | 2013-02-27 | Inview Technology Ltd | Recommending audio-visual content based on user's personal preferences and the profiles of others |
CN105989368A (zh) * | 2015-02-13 | 2016-10-05 | Spreadtrum Communications (Tianjin) Co., Ltd. | Object detection method and apparatus, and mobile terminal |
CN104714852B (zh) * | 2015-03-17 | 2018-05-22 | Huazhong University of Science and Technology | Parameter synchronization optimization method suitable for distributed machine learning and system thereof |
CN105005911B (zh) * | 2015-06-26 | 2017-09-19 | Shenzhen Tencent Computer Systems Co., Ltd. | Operation system and operation method for a deep neural network |
CN104966104B (zh) * | 2015-06-30 | 2018-05-11 | Shandong Management University | Video classification method based on three-dimensional convolutional neural networks |
CN105786757A (zh) * | 2016-02-26 | 2016-07-20 | Tu Xuping | On-board integrated distributed high-performance computing system apparatus |
-
2016
- 2016-10-28 CN CN201610972729.4A patent/CN108021982B/zh active Active
-
2017
- 2017-10-30 WO PCT/CN2017/108450 patent/WO2018077293A1/zh active Application Filing
-
2019
- 2019-04-11 US US16/382,058 patent/US20190236453A1/en not_active Abandoned
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102405495A (zh) * | 2009-03-11 | 2012-04-04 | Google Inc. | Audio classification for information retrieval using sparse features |
WO2016154440A1 (en) * | 2015-03-24 | 2016-09-29 | Hrl Laboratories, Llc | Sparse inference modules for deep learning |
CN105574506A (zh) * | 2015-12-16 | 2016-05-11 | Shenzhen SenseTime Technology Co., Ltd. | Intelligent face pursuit system and method based on deep learning and large-scale clustering |
CN105791189A (zh) * | 2016-02-23 | 2016-07-20 | Chongqing University | Sparse coefficient decomposition method for improving reconstruction accuracy |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109740755A (zh) * | 2019-01-08 | 2019-05-10 | Shenzhen Onething Technologies Co., Ltd. | Data processing method based on gradient descent and related apparatus |
CN109740755B (zh) * | 2019-01-08 | 2023-07-18 | Shenzhen Onething Technologies Co., Ltd. | Data processing method based on gradient descent and related apparatus |
CN112364897A (zh) * | 2020-10-27 | 2021-02-12 | Dawning Information Industry (Beijing) Co., Ltd. | Distributed training method and apparatus, storage medium, and electronic device |
CN112364897B (zh) * | 2020-10-27 | 2024-05-28 | Dawning Information Industry (Beijing) Co., Ltd. | Distributed training method and apparatus, storage medium, and electronic device |
Also Published As
Publication number | Publication date |
---|---|
US20190236453A1 (en) | 2019-08-01 |
CN108021982A (zh) | 2018-05-11 |
CN108021982B (zh) | 2021-12-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018077293A1 (zh) | Data transmission method and system, and electronic device | |
CN113808231B (zh) | Information processing method and apparatus, image rendering method and apparatus, and electronic device | |
WO2017143747A1 (zh) | Mobile terminal network request method and system | |
WO2023051035A1 (zh) | Robot data transmission method and apparatus, electronic device, and storage medium | |
CN113157480A (zh) | Error information processing method, apparatus, storage medium, and terminal | |
CN112671892A (zh) | Data transmission method and apparatus, electronic device, medium, and computer program product | |
US20160127745A1 (en) | 2016-05-05 | Efficient screen image transfer |
CN115186738B (zh) | Model training method, apparatus, and storage medium | |
CN115904240A (zh) | Data processing method and apparatus, electronic device, and storage medium | |
CN114386577A (zh) | Method, device, and storage medium for executing a deep learning model | |
CN113344213A (zh) | Knowledge distillation method and apparatus, electronic device, and computer-readable storage medium | |
CN116611495B (zh) | Compression method, training method, and processing method and apparatus for a deep learning model | |
CN116341689B (zh) | Machine learning model training method and apparatus, electronic device, and storage medium | |
CN115294396B (zh) | Backbone network training method and image classification method | |
CN117075920A (zh) | Optimization method and apparatus for an application installation package | |
CN115049051A (zh) | Model weight compression method and apparatus, electronic device, and storage medium | |
CN117112601A (zh) | Database data compression method, apparatus, device, and storage medium | |
CN117556075A (zh) | Data processing method, apparatus, device, and medium for a PACS system | |
CN115906982A (zh) | Distributed training method, gradient communication method and apparatus, and electronic device | |
CN113781494A (zh) | Image segmentation method and apparatus, electronic device, and computer-readable medium | |
CN112990422A (zh) | Parameter server, client, and weight parameter processing method and system | |
CN117351299A (zh) | Image generation and model training methods, apparatus, device, and storage medium | |
CN112988366A (zh) | Parameter server, master/slave clients, and weight parameter processing method and system | |
CN116310518A (zh) | Training method and classification method for an image classification model, device, and related apparatus | |
CN116016484A (zh) | Data transmission method, apparatus, device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17865594 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 17865594 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 14.08.2019) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 17865594 Country of ref document: EP Kind code of ref document: A1 |