CN116033492A - Method and device for segmenting a Transformer model in a mobile edge environment

Info

Publication number
CN116033492A
CN116033492A CN202211624092.1A CN202211624092A CN116033492A CN 116033492 A CN116033492 A CN 116033492A CN 202211624092 A CN202211624092 A CN 202211624092A CN 116033492 A CN116033492 A CN 116033492A
Authority
CN
China
Prior art keywords
devices
model
mobile edge
Transformer
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211624092.1A
Other languages
Chinese (zh)
Inventor
屈志昊
周文轩
叶保留
王博文
柳泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Industrial Control Safety Innovation Technology Co ltd
Hohai University HHU
Original Assignee
Shanghai Industrial Control Safety Innovation Technology Co ltd
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Industrial Control Safety Innovation Technology Co ltd and Hohai University HHU

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a segmentation method and a segmentation device for a Transformer model in a mobile edge environment. The method comprises the following steps: selecting a number of mobile edge devices not exceeding the number of encoder layers of the Transformer model as a device group, acquiring the computing capacity and transmit power of the devices, and calculating the D2D communication time between devices from the inter-device distance, the channel bandwidth and the channel noise; solving the modeled optimization problem with a branch-and-bound method to obtain an optimal transmission path; and, based on the obtained optimal transmission path, partitioning the Transformer model by layer and placing it on the corresponding mobile edge devices. The invention improves the communication efficiency and overall computing performance of the system and reduces the computational complexity of splitting and orchestrating the network.

Description

Method and device for segmenting a Transformer model in a mobile edge environment
Technical Field
The invention relates to the technical field of distributed computing, in particular to a method and a device for segmenting a Transformer model in a mobile edge environment.
Background
Breakthroughs in deep learning have driven the development of artificial-intelligence fields such as computer vision and natural language processing, and the success of deep neural networks stems from training large-scale models on large-scale data sets. However, growing model and data sizes mean that a single device can no longer complete the training of a deep neural network model, so training with cloud computing is a natural choice. Under the traditional cloud computing framework, data sources must offload the data they generate to the cloud, where a large-scale distributed machine learning scheme (such as data parallelism) is applied uniformly. However, because sensors and Internet of Things devices generate a large amount of data per hour (for example, more than 1 TB), transmitting the data to the cloud incurs a large communication overhead; furthermore, offloading data raises privacy and security issues such as malicious servers and hacking. To solve these two problems, researchers have proposed a machine learning paradigm called split learning (also called segmentation learning), which divides a machine learning model into several parts and places them on different devices (a client and a server) that each carry out their share of the propagation during training. Users therefore do not need to transmit raw data to the cloud, which avoids the communication overhead and the security risks of offloading.
The Transformer model is a deep learning model employing a self-attention mechanism that computes weights for each portion of the input data and forms weighted sums; early Transformers were mainly used in natural language processing. Like recurrent neural networks, Transformers are designed to process sequence data (e.g., text translation). However, unlike recurrent neural networks, which process data recursively, the Transformer processes all data simultaneously, so it not only overcomes the "forgetting" weakness of recurrent neural networks but can also be parallelized to accelerate training. Today the Transformer model is applied in computer vision as well as natural language processing. As shown in fig. 1, a general Transformer model mainly comprises two parts: an encoder module and a decoder module. In natural language processing (e.g., translation tasks), the Transformer model uses both the encoder and decoder modules, while for classification tasks (e.g., text classification, picture classification) it uses only the encoder module. The invention applies to classification tasks that use only the encoder module. The encoder module of the Transformer model consists of several isomorphic encoders, with the output of each encoder serving as the input of the next; before the first encoder, an embedding layer maps the input to a low-dimensional space and adds position information. Each encoder mainly comprises two parts, a multi-head attention module and a feedforward neural network, and residual connections and layer normalization are applied after each module to accelerate convergence.
Conventional split learning breaks a complete model into two parts, a client network and a server network. A batch of edge devices and a server cooperatively train the whole model: each edge device in turn updates the client network and the server network using the data it generates, and sends the updated model to the next edge device. This both preserves the privacy of the edge device's own data and reduces its computation and storage cost, making it an efficient distributed machine learning paradigm. However, such split learning can incur intolerable communication overhead, especially between the edge device and the server, which not only requires lengthy communication time and consumes much of the device's energy but also generates high economic cost. Multi-hop split learning can address the expensive communication overhead. However, edge devices are highly heterogeneous and differ greatly in computing power and storage capacity, so arbitrarily splitting the network and orchestrating it onto a group of edge devices can severely reduce the training efficiency of the model. Although the encoders of the Transformer model are isomorphic, the splitting and orchestration needed to train a Transformer model most efficiently with multi-hop split learning is an NP-hard problem, for which no polynomial-time algorithm is known.
Disclosure of Invention
Object of the invention: to overcome the shortcomings of the prior art, the invention provides a segmentation method and a segmentation device for a Transformer model in a mobile edge environment which, based on the multi-hop split learning paradigm, improve the communication efficiency and overall computing performance of the system and reduce the computational complexity of splitting and orchestrating the network, while protecting the privacy of edge device data.
Technical scheme: a segmentation method for a Transformer model in a mobile edge environment comprises the following steps:
selecting a number of mobile edge devices not exceeding the number of encoder layers of the Transformer model as a device group, acquiring the computing capacity and transmit power of the devices, and calculating the D2D communication time between devices from the inter-device distance, the channel bandwidth and the channel noise;
establishing an optimization problem model based on the D2D communication time between devices, using as the segmentation basis that the edge device with the strongest computing capability holds the most Transformer layers and each remaining device holds one Transformer layer, and taking minimization of the computation time of the Transformer model as the objective, and solving the modeled optimization problem with a branch-and-bound algorithm to obtain an optimal transmission path;
and, based on the obtained optimal transmission path, partitioning the Transformer model by layer and placing it on the corresponding mobile edge devices.
Preferably, the D2D communication time is calculated as follows:
$$t_{ij} = \frac{A}{s_{ij}}, \qquad s_{ij} = W \log_2\left(1 + \frac{p_i\, d_{ij}^{-\alpha}}{N_o}\right)$$
where t_ij represents the transmission time from device i to device j, A represents the amount of data transmitted, s_ij represents the data transmission rate from device i to device j, W is the bandwidth of the channel, α is the path loss exponent, N_o is the noise power, p_i represents the transmit power of device i, and d_ij represents the physical distance between device i and device j.
Preferably, the modeled optimization problem is as follows:
Figure BDA0004003322720000032
the constraint conditions are as follows:
Figure BDA0004003322720000033
where N is the number of mobile edge devices in the device group; x_ij is an indicator variable whose value 1 indicates that device i transmits data to device j and whose value 0 indicates that it does not; w_i represents the time device i needs to compute one layer of the Transformer model; and z_i is an integer that restricts the path to a single connected component.
Preferably, obtaining the optimal transmission path using the branch-and-bound algorithm includes: obtaining a suboptimal solution with a genetic algorithm as the upper bound of the search algorithm, generating a priority queue, and adding the starting node to the queue; then, while the queue is not empty and the lower bound of the head node is less than the global upper bound, performing the following: taking out the head element of the queue and traversing the remaining nodes; if the lower bound obtained by adding a traversed node to the path is less than or equal to the current upper bound, adding that node to the priority queue; and if a leaf node is reached and the total cost of the path is less than the current global upper bound, updating the upper bound.
Preferably, placing the Transformer model on the corresponding mobile edge devices by layer includes: deploying the Transformer model onto the edge devices in sequence along the transmission path, wherein the edge device with the strongest computing capability holds the most Transformer layers, namely the number of network layers minus the number of devices plus 1, and every other device holds one layer of the network model.
The invention also provides a segmentation device for a Transformer model in a mobile edge environment, which comprises:
a communication time determining module, configured to select a number of mobile edge devices not exceeding the number of encoder layers of the Transformer model as a device group, acquire the computing capacity and transmit power of the devices, and calculate the D2D communication time between devices from the inter-device distance, the channel bandwidth and the channel noise;
a path calculation module, configured to establish an optimization problem model based on the D2D communication time between devices, using as the segmentation basis that the edge device with the strongest computing capability holds the most Transformer layers and each remaining device holds one Transformer layer, and taking minimization of the computation time of the Transformer model as the objective, and to solve the modeled optimization problem with a branch-and-bound algorithm to obtain an optimal transmission path;
and a model segmentation module, configured to partition the Transformer model by layer and place it on the corresponding mobile edge devices based on the obtained optimal transmission path.
The present invention also provides a computer device comprising: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and when executed by the processors implement the steps of the method for segmenting a Transformer model in a mobile edge environment as described above.
The present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for segmenting a Transformer model in a mobile edge environment as described above.
The invention also provides a mobile edge computing system comprising a plurality of mobile edge devices, wherein the partitioned Transformer model is distributed over the mobile edge devices based on an optimal transmission path, the number of mobile edge devices does not exceed the number of layers of the Transformer model, and the optimal transmission path is obtained by the following method:
selecting a number of mobile edge devices not exceeding the number of encoder layers of the Transformer model as a device group, acquiring the computing capacity and transmit power of the devices, and calculating the D2D communication time between devices from the inter-device distance, the channel bandwidth and the channel noise; and establishing an optimization problem model based on the D2D communication time between devices, using as the segmentation basis that the edge device with the strongest computing capability holds the most Transformer layers, and taking minimization of the computation time of the Transformer model as the objective, and solving the modeled optimization problem with a branch-and-bound algorithm to obtain an optimal transmission path.
Compared with the prior art, the invention has the following advantages and beneficial effects: based on the characteristics of the Transformer model, the invention provides an efficient training method under the multi-hop split learning paradigm. When the network topology meets certain conditions, a strategy for splitting and orchestrating the network with an approximation ratio of 2 can be obtained in polynomial time; in other cases, a search strategy based on the branch-and-bound method is provided, which greatly reduces the time complexity compared with a brute-force search for the optimal splitting and orchestration. The invention solves the problem of how to partition the network in multi-hop split learning and greatly improves the resource utilization and the training speed of the Transformer model.
Drawings
FIG. 1 is a block diagram of a general Transformer model;
fig. 2 (a) and 2 (b) are architecture diagrams of the two basic forms of split learning;
FIG. 3 is a basic architecture diagram of multi-hop split learning;
FIG. 4 is a flow chart of the method for segmenting a Transformer model according to the present invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings.
Basic split learning has two forms, original split learning and U-shaped split learning, whose configurations are shown in fig. 2 (a) and fig. 2 (b). In original split learning, the training flow is as follows:
1) The client cuts the model into two parts, placing the former part on the client and the latter part on the server;
2) The client performs forward propagation up to the cut layer and transmits the intermediate data to the server;
3) The server receives the intermediate data and continues forward propagation; after obtaining the result, it performs back propagation down to the cut layer, updates its network parameters, and transmits the intermediate data to the client;
4) The client receives the intermediate data, continues back propagation and updates its network parameters, and after the computation is complete transmits the network parameters to the next client;
5) Steps 2) to 4) are repeated until the model converges.
U-shaped split learning aims to protect label privacy in addition to the raw data features: the network model is divided into three parts, and the last part is placed back on the client for computation, thereby protecting label privacy.
The split learning described above is based on a client-server architecture, and its training process incurs communication delay that is hard to tolerate, wastes the device's own energy, and can generate high economic cost. Split learning also has variants, such as multi-hop split learning, which uses a decentralized architecture, shown in fig. 3. Under this architecture, the network is split into several parts that are placed on nearby edge devices, and device-to-device (D2D) communication is used between them. This reduces the load on the serving base station and, while protecting data privacy, lowers both communication cost and economic cost. The training process is as follows:
1) The network is deployed on n different devices according to a specified partition;
2) Device 1 to device n perform forward propagation in sequence;
3) Device n to device 1 perform back propagation and update parameters in sequence;
4) Steps 2) to 3) are repeated until the model converges.
The invention is based on a multi-layer Transformer model and aims to find an optimal segmentation scheme, together with an optimal placement scheme for the resulting sub-models, so that the whole network trains as fast as possible. The invention proposes an algorithm for this problem, named the M-SLT algorithm. The algorithm relies on a decentralized architecture in which the Transformer model is divided into several sub-models that are placed on a group of edge devices, with intermediate results exchanged over D2D links. Compared with a client-server architecture, M-SLT improves communication efficiency because D2D communication typically has lower latency and higher data rates than communicating with edge servers over cellular links. Furthermore, by fully exploiting the computing power of the edge devices, split learning can be performed in a resource-saving and flexible manner. The core of the algorithm is to find the ordering of the Transformer sub-networks on the edge devices that yields the best model training speed. Once the optimal ordering is obtained, the model is placed on the edge devices in sequence, with the device that has the strongest computing power receiving the largest number of network layers and each remaining device receiving one layer. For example, suppose there are three devices ordered 2, 1, 3 and a five-layer network, and device 1 has the strongest computing power; then device 2 receives layer 1, device 1 receives layers 2 to 4, and device 3 receives layer 5.
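The placement rule in the example above can be sketched in Python as follows. This is a minimal illustration only: the function name `assign_layers`, the dictionary of relative compute capabilities and its values are hypothetical and not part of the invention.

```python
def assign_layers(path, compute_power, num_layers):
    """Map each device on the transmission path to a contiguous block of layers.

    path          -- device ids in transmission order
    compute_power -- dict: device id -> relative computing capability (hypothetical units)
    num_layers    -- number of Transformer encoder layers
    """
    n = len(path)
    if n > num_layers:
        raise ValueError("the device count must not exceed the number of layers")
    # The strongest device holds (layers - devices + 1) layers; every other device holds one.
    strongest = max(path, key=lambda dev: compute_power[dev])
    assignment = {}
    next_layer = 1
    for dev in path:
        count = num_layers - n + 1 if dev == strongest else 1
        assignment[dev] = list(range(next_layer, next_layer + count))
        next_layer += count
    return assignment


# Three devices ordered 2, 1, 3; five layers; device 1 is the strongest.
example = assign_layers([2, 1, 3], {1: 3.0, 2: 1.0, 3: 1.5}, 5)
print(example)  # {2: [1], 1: [2, 3, 4], 3: [5]}
```

Each device receives a contiguous range of layers, so only the output of its last layer has to be sent over the D2D link to the next device on the path.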
The problem is modeled as follows:
Let the set shown in
Figure BDA0004003322720000062
denote the mobile edge devices within a device group, where device 1 holds the original training data. In a mobile edge environment, when device i passes its intermediate result to the remaining nodes, its transmit power is set to p_i. For any two devices i and j, d_ij denotes the physical distance between them. The data transmission rate from device i to device j can then be derived using the Shannon formula:
$$s_{ij} = W \log_2\left(1 + \frac{p_i\, d_{ij}^{-\alpha}}{N_o}\right)$$
where W is the bandwidth of the channel, α is the path loss exponent, and N_o is the noise power. Owing to the inherent computational properties of the Transformer model, the amount of intermediate-result data between every two layers is a constant, denoted A, so the transmission time from device i to device j is t_ij = A/s_ij. The core of the problem is thus to find an ordering of the devices (with device 1 as the starting point) under which the edge devices train efficiently and, once the ordering is found, to assign the network layers to the edge devices in sequence.
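As a minimal sketch of the communication-time calculation described above, the following Python snippet assumes a Shannon-type rate s_ij = W log2(1 + p_i d_ij^(-α) / N_o) together with the stated relation t_ij = A / s_ij. The function names, the use of 2-D coordinates for device positions, and all numeric values are illustrative assumptions and not part of the invention.

```python
import math


def d2d_rate(bandwidth_hz, tx_power_w, distance_m, path_loss_exp, noise_power_w):
    """Rate s_ij in bit/s, assuming SNR = p_i * d_ij^(-alpha) / N_o."""
    snr = tx_power_w * distance_m ** (-path_loss_exp) / noise_power_w
    return bandwidth_hz * math.log2(1.0 + snr)


def d2d_time_matrix(data_bits, bandwidth_hz, tx_power, positions,
                    path_loss_exp, noise_power_w):
    """Return t[i][j] = A / s_ij for every ordered pair of distinct devices."""
    n = len(positions)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d_ij = math.dist(positions[i], positions[j])
            rate = d2d_rate(bandwidth_hz, tx_power[i], d_ij,
                            path_loss_exp, noise_power_w)
            t[i][j] = data_bits / rate
    return t


# Hypothetical setup: three devices, 8 Mbit of intermediate data, 1 MHz channel.
times = d2d_time_matrix(
    data_bits=8e6,
    bandwidth_hz=1e6,
    tx_power=[0.2, 0.1, 0.1],
    positions=[(0.0, 0.0), (30.0, 40.0), (60.0, 0.0)],
    path_loss_exp=3.0,
    noise_power_w=1e-10,
)
for row in times:
    print([round(v, 3) for v in row])
```

Because the rate depends on the transmit power of the sending device, the resulting matrix is in general not symmetric: t_ij and t_ji differ whenever p_i and p_j differ.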
To build this optimization problem, an indicator variable x_ij is defined, which equals 1 when device i transmits data to device j (and 0 otherwise). Based on the previous assumptions, the indicator variables must satisfy the following constraints:
Figure BDA0004003322720000071
Figure BDA0004003322720000072
however, these two constraints do not guarantee that there is only one connected component in the network transmission topology. Therefore, the following constraints need to be introduced:
Figure BDA0004003322720000074
Here z_i is an auxiliary variable for device i. The constraint ensures that the path formed by the model contains only one connected component (i.e., no sub-loop), which can be shown by contradiction. (1) If device i transmits to device j, then x_ij = 1, and the constraint gives z_j ≥ z_i + 1, so the value of z increases strictly along the path. (2) Suppose a sub-loop exists that satisfies the constraint; at least one such sub-loop does not include the starting device (number 1), e.g., 2 -> 3 -> 2, and then z_2 ≥ z_3 + 1 ≥ z_2 + 2, a contradiction. The model therefore contains no sub-loops.
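The sub-loop elimination constraint appears only as an image in this text. One standard constraint with exactly the behaviour argued above is the Miller-Tucker-Zemlin (MTZ) form used for the traveling salesman problem. The expression below is that standard form, given as an assumption, and is not necessarily the exact expression of the original:

```latex
z_i - z_j + N\,x_{ij} \le N - 1, \qquad 2 \le i \ne j \le N .
```

When x_ij = 1 this reduces to z_j ≥ z_i + 1, which is the inequality used in step (1) of the argument. When x_ij = 0 it reads z_i - z_j ≤ N - 1, which holds automatically if each z_i lies between 1 and N - 1.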
Let w_i denote the time device i needs to compute one layer of the Transformer model; the following optimization problem can then be established:
Figure BDA0004003322720000073
which satisfies the following constraints:
Figure BDA0004003322720000081
It can be shown that the traveling salesman problem can be reduced to this optimization problem in polynomial time: each city in the traveling salesman problem maps to one edge device of the optimization problem, and both can be abstracted as finding a shortest Hamiltonian path in a graph. The optimization problem is therefore NP-hard, and in practice it is solved either approximately or by exhaustive search.
Referring to fig. 4, the method of the present invention mainly comprises the following steps. First, the communication time between every two devices in the device group is determined from indicators such as the signal transmit power of each device, the distance between every two devices, the channel bandwidth and the channel noise. Next, the transmission path is calculated using a branch-and-bound algorithm. Finally, the network model is partitioned, distributed and trained according to the solved path.
Specifically:
The device group is abstracted as a complete graph G = (V, E), in which the devices are the nodes V and the communication time plus the computation time forms the edges E. The complete graph is stored using the adjacency matrix C. The search algorithm uses the following definitions:
Determined path U: for the device currently being searched, the determined path is U = (r_1, r_2, ..., r_k), where U is a sequence and r_k is the kth device in the sequence;
Path U_1 with the starting point removed: the path obtained by removing r_1 from path U;
Path U_2 with the end point removed: the path obtained by removing r_k from path U;
Upper bound ub: a transmission order is initialized arbitrarily and the total time of the resulting path is taken as the upper bound of the search space; each time a leaf node is reached, the upper bound is updated if the total time of that leaf's path is smaller than the upper bound.
Lower bound lb: for the search path U at the current search position, the lower bound is updated according to the following formula; if the lower bound produced by the current search exceeds the upper bound, the branch is pruned directly. If the lower bounds of all remaining unsearched positions exceed the upper bound, the result corresponding to the upper bound is returned directly.
Figure BDA0004003322720000082
where c is the adjacency matrix of transmission times, c[r_1] denotes row r_1 of matrix c, and c[:][r_k] denotes column r_k of matrix c.
The method is implemented according to the following algorithm flow:
a) Obtain an initial upper bound ub using a genetic algorithm, generate a priority queue Q, and add the starting point to the queue;
b) While the lower bound of the head node of queue Q is less than the upper bound, execute the following loop:
(1) Take out the head element p of the queue;
(2) Traverse the remaining nodes; if the lower bound of the path obtained by adding a node to path p is less than or equal to the current upper bound, add that node to the priority queue Q; if a leaf node is reached and the total cost of the path is less than the current upper bound, update the upper bound.
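The search loop above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the exact procedure of the invention. The lower bound used here (cost so far plus, for each unvisited device, its cheapest incoming edge) is a simple admissible bound chosen for the sketch, because the bound formula is not reproduced in this text. A greedy nearest-neighbour path stands in for the genetic algorithm as the initial upper bound. The names `branch_and_bound`, `lower_bound`, `greedy_path` and `path_cost`, and the cost matrix values, are hypothetical.

```python
import heapq


def path_cost(c, path):
    """Total edge cost along a path of device indices."""
    return sum(c[a][b] for a, b in zip(path, path[1:]))


def greedy_path(c, start=0):
    """Nearest-neighbour path, used here only to seed the upper bound."""
    path, left = [start], set(range(len(c))) - {start}
    while left:
        nxt = min(left, key=lambda j: c[path[-1]][j])
        path.append(nxt)
        left.remove(nxt)
    return path


def lower_bound(c, path, cost):
    """Cost so far plus the cheapest admissible incoming edge of each unvisited node."""
    visited = set(path)
    rest = [j for j in range(len(c)) if j not in visited]
    bound = cost
    for j in rest:
        preds = [path[-1]] + [i for i in rest if i != j]
        bound += min(c[i][j] for i in preds)
    return bound


def branch_and_bound(c, start=0):
    """Best-first search for a minimum-cost Hamiltonian path that begins at `start`."""
    n = len(c)
    best_path = greedy_path(c, start)
    ub = path_cost(c, best_path)
    queue = [(lower_bound(c, [start], 0.0), 0.0, [start])]
    # Stop once the most promising partial path cannot beat the incumbent.
    while queue and queue[0][0] < ub:
        _, cost, path = heapq.heappop(queue)
        for j in range(n):
            if j in path:
                continue
            new_path = path + [j]
            new_cost = cost + c[path[-1]][j]
            if len(new_path) == n:            # leaf node: a complete ordering
                if new_cost < ub:
                    ub, best_path = new_cost, new_path
            else:
                new_lb = lower_bound(c, new_path, new_cost)
                if new_lb < ub:               # otherwise prune this branch
                    heapq.heappush(queue, (new_lb, new_cost, new_path))
    return best_path, ub


# Hypothetical 4-device cost matrix (communication plus computation time).
cost_matrix = [
    [0, 4, 1, 9],
    [4, 0, 6, 2],
    [1, 6, 0, 3],
    [9, 2, 3, 0],
]
best, total = branch_and_bound(cost_matrix)
print(best, total)
```

The sketch prunes with a strict comparison, since a partial path whose bound equals the incumbent cost cannot improve on it. A tighter lower bound or a better initial upper bound prunes more branches without changing the result.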
After the transmission path is determined, the Transformer model is deployed onto the edge devices in sequence along the transmission path, wherein the edge device with the strongest computing capability holds the most Transformer layers (i.e., the number of network layers minus the number of devices plus 1), and every other device holds one layer of the network model.
The invention also provides a segmentation device for a Transformer model in a mobile edge environment, which comprises:
a communication time determining module, configured to select a number of mobile edge devices not exceeding the number of encoder layers of the Transformer model as a device group, acquire the computing capacity and transmit power of the devices, and calculate the D2D communication time between devices from the inter-device distance, the channel bandwidth and the channel noise;
a path calculation module, configured to establish an optimization problem model based on the D2D communication time between devices, using as the segmentation basis that the edge device with the strongest computing capability holds the most Transformer layers and each remaining device holds one Transformer layer, and taking minimization of the computation time of the Transformer model as the objective, and to solve the modeled optimization problem with a branch-and-bound algorithm to obtain an optimal transmission path;
and a model segmentation module, configured to partition the Transformer model by layer and place it on the corresponding mobile edge devices based on the obtained optimal transmission path.
It should be understood that the segmentation device for a Transformer model in a mobile edge environment in the embodiment of the present invention can implement all the technical solutions of the above method embodiments, that the functions of each functional module can be implemented according to the methods in those embodiments, and that the specific implementation process is described in the relevant parts of the above embodiments and is not repeated here.
The present invention also provides a computer device comprising: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and when executed by the processors implement the steps of the method for segmenting a Transformer model in a mobile edge environment as described above.
The present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for segmenting a Transformer model in a mobile edge environment as described above.
The invention also provides a mobile edge computing system comprising a plurality of mobile edge devices, wherein the partitioned Transformer model is distributed over the mobile edge devices based on an optimal transmission path, the number of mobile edge devices does not exceed the number of layers of the Transformer model, and the optimal transmission path is obtained by the following method: selecting a number of mobile edge devices not exceeding the number of encoder layers of the Transformer model as a device group, acquiring the computing capacity and transmit power of the devices, and calculating the D2D communication time between devices from the inter-device distance, the channel bandwidth and the channel noise; and establishing an optimization problem model based on the D2D communication time between devices, using as the segmentation basis that the edge device with the strongest computing capability holds the most Transformer layers, and taking minimization of the computation time of the Transformer model as the objective, and solving the modeled optimization problem with a branch-and-bound algorithm to obtain an optimal transmission path.
The invention models the optimization problem of training a Transformer model with multi-hop split learning in a mobile edge environment, and provides a segmentation method (M-SLT) that assists training of the Transformer model in that environment. The invention divides a Transformer model into several sub-models based on a decentralized architecture and places them on a group of edge devices, with intermediate results exchanged over D2D links. Compared with a client-server architecture, M-SLT improves communication efficiency because D2D communication typically has lower latency and higher data rates than communicating with edge servers over cellular links.

Claims (10)

1. A method for segmenting a Transformer model in a mobile edge environment, characterized by comprising the following steps:
selecting a number of mobile edge devices not exceeding the number of encoder layers of the Transformer model as a device group, acquiring the computing capacity and transmit power of the devices, and calculating the D2D communication time between devices from the inter-device distance, the channel bandwidth and the channel noise;
establishing an optimization problem model based on the D2D communication time between devices, using as the segmentation basis that the edge device with the strongest computing capability holds the most Transformer layers and each remaining device holds one Transformer layer, and taking minimization of the computation time of the Transformer model as the objective, and solving the modeled optimization problem with a branch-and-bound algorithm to obtain an optimal transmission path;
and, based on the obtained optimal transmission path, partitioning the Transformer model by layer and placing it on the corresponding mobile edge devices.
2. The method of claim 1, wherein the D2D communication time is calculated as follows:
Figure FDA0004003322710000011
wherein t_ij represents the transmission time from device i to device j, A represents the amount of data transmitted, s_ij represents the data transmission rate from device i to device j, W is the channel bandwidth, α is the path loss exponent, N_o is the noise power, p_i represents the transmit power of device i, and d_ij represents the physical distance between device i and device j.
3. The method of claim 2, wherein the modeled optimization problem is as follows:
Figure FDA0004003322710000012
the constraint conditions are as follows:
Figure FDA0004003322710000013
wherein N is the number of mobile edge devices in the device group, x_ij is an indicator variable whose value is 1 if device i transmits data to device j and 0 otherwise, w_i represents the time required for device i to compute each layer of the Transformer model, and z_i is an integer variable that restricts the solution to a single connected component.
4. The method of claim 1, wherein obtaining an optimal transmission path using a branch-and-bound algorithm comprises: obtaining a suboptimal solution with a genetic algorithm and using it as the upper bound of the search; creating a priority queue and adding the starting node to the queue; and then, while the queue is not empty and the lower bound of the head node is less than the global upper bound, performing the following: taking out the head element of the queue and traversing the remaining nodes; if the lower bound obtained by adding a traversed node to the path is less than or equal to the current upper bound, adding that node to the priority queue; and if a leaf node is reached and the total cost of the path is less than the current global upper bound, updating the upper bound.
5. The method of claim 1, wherein placing the Transformer model, divided by layer, on the corresponding mobile edge devices comprises: deploying the Transformer model onto the edge devices in sequence along the transmission path, wherein the edge device with the strongest computing capability holds the most Transformer layers, namely the number of network layers minus the number of devices plus 1, and every other device holds one layer of the network model.
6. A segmentation apparatus for a Transformer model in a mobile edge environment, comprising:
a communication time determining module configured to select, as a device group, a number of mobile edge devices not exceeding the number of layers of the Transformer model encoder, acquire the computing capability and transmit power of the devices, and calculate the D2D communication time between devices from the inter-device distance, the channel bandwidth, and the channel noise;
a path calculation module configured to establish, based on the D2D communication time between devices, an optimization problem model in which the edge device with the strongest computing capability holds the most Transformer layers and each remaining device holds one Transformer layer as the segmentation basis, with the goal of minimizing the computation time of the Transformer model, and to solve the modeled optimization problem with a branch-and-bound algorithm to obtain an optimal transmission path;
and a model segmentation module configured to place the Transformer model, divided by layer, on the corresponding mobile edge devices based on the obtained optimal transmission path.
7. A computer device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and which, when executed by the processors, implement the steps of the method for segmenting a Transformer model in a mobile edge environment according to any one of claims 1-5.
8. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method for segmenting a Transformer model in a mobile edge environment according to any one of claims 1-5.
9. A mobile edge computing system comprising a plurality of mobile edge devices, wherein the mobile edge devices are assigned partitions of a Transformer model based on an optimal transmission path, the number of mobile edge devices is no greater than the number of Transformer model layers, and the optimal transmission path is obtained by the following method:
selecting, as a device group, a number of mobile edge devices not exceeding the number of layers of the Transformer model encoder, acquiring the computing capability and transmit power of the devices, and calculating the D2D communication time between devices from the inter-device distance, the channel bandwidth, and the channel noise; and establishing, based on the D2D communication time between devices, an optimization problem model in which the edge device with the strongest computing capability holds the most Transformer layers and each remaining device holds one Transformer layer as the segmentation basis, with the goal of minimizing the computation time of the Transformer model, and solving the modeled optimization problem with a branch-and-bound algorithm to obtain the optimal transmission path.
10. The system of claim 9, wherein the modeled optimization problem is as follows:
Figure FDA0004003322710000031
the constraint conditions are as follows:
Figure FDA0004003322710000032
where A represents the amount of data transmitted, W is the channel bandwidth, α is the path loss exponent, N_o is the noise power, p_i represents the transmit power of device i, d_ij represents the physical distance between device i and device j, N is the number of edge devices in the device group, x_ij is an indicator variable whose value is 1 if device i transmits data to device j and 0 otherwise, w_i represents the time required for device i to compute each layer of the Transformer model, and z_i is an integer variable that restricts the solution to a single connected component.
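The best-first search described in claim 4 can be sketched in Python as below. This is a minimal illustration, not the patented implementation: the objective is assumed to be the sum of D2D times along an open path that visits every device once from a given start, the initial upper bound comes from a greedy nearest-neighbour heuristic rather than the genetic algorithm named in the claim, and the lower bound (cost so far plus the cheapest incoming link of each unvisited device) is one simple admissible choice. All function names are hypothetical.

```python
import heapq


def greedy_path(t, start):
    """Nearest-neighbour path; stands in for the genetic-algorithm upper bound."""
    n = len(t)
    path, cost = [start], 0.0
    while len(path) < n:
        last = path[-1]
        nxt = min((j for j in range(n) if j not in path), key=lambda j: t[last][j])
        cost += t[last][nxt]
        path.append(nxt)
    return path, cost


def branch_and_bound_path(t, start=0):
    """Find the cheapest path visiting every device once, starting at `start`.

    t[i][j] is the D2D communication time from device i to device j.
    """
    n = len(t)
    # Cheapest incoming link per device, used for an admissible lower bound.
    min_in = [min((t[u][v] for u in range(n) if u != v), default=0.0)
              for v in range(n)]

    def lower_bound(cost, path):
        visited = set(path)
        return cost + sum(min_in[v] for v in range(n) if v not in visited)

    best_path, upper = greedy_path(t, start)
    queue = [(lower_bound(0.0, [start]), 0.0, [start])]
    # Continue while the most promising partial path could still beat the bound.
    while queue and queue[0][0] < upper:
        _, cost, path = heapq.heappop(queue)
        for j in range(n):
            if j in path:
                continue
            new_cost = cost + t[path[-1]][j]
            new_path = path + [j]
            if len(new_path) == n:
                # Leaf node: a complete path; tighten the global upper bound.
                if new_cost < upper:
                    best_path, upper = new_path, new_cost
            else:
                bound = lower_bound(new_cost, new_path)
                if bound <= upper:
                    heapq.heappush(queue, (bound, new_cost, new_path))
    return best_path, upper
```

A tighter initial upper bound prunes more of the search tree, which is presumably why the claim seeds the search with a heuristic solution before branching.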
CN202211624092.1A 2022-12-16 2022-12-16 Method and device for segmenting Transformer model in mobile edge environment Pending CN116033492A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211624092.1A CN116033492A (en) 2022-12-16 2022-12-16 Method and device for segmenting Transformer model in mobile edge environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211624092.1A CN116033492A (en) 2022-12-16 2022-12-16 Method and device for segmenting Transformer model in mobile edge environment

Publications (1)

Publication Number Publication Date
CN116033492A true CN116033492A (en) 2023-04-28

Family

ID=86076898

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211624092.1A Pending CN116033492A (en) 2022-12-16 2022-12-16 Method and device for segmenting Transformer model in mobile edge environment

Country Status (1)

Country Link
CN (1) CN116033492A (en)

Similar Documents

Publication Publication Date Title
CN111242282B (en) Deep learning model training acceleration method based on end edge cloud cooperation
CN114756383B (en) Distributed computing method, system, equipment and storage medium
CN108924198B (en) Data scheduling method, device and system based on edge calculation
CN113515370B (en) Distributed training method for large-scale deep neural network
CN109299781B (en) Distributed deep learning system based on momentum and pruning
CN113128702A (en) Neural network self-adaptive distributed parallel training method based on reinforcement learning
CN103699606A (en) Large-scale graphical partition method based on vertex cut and community detection
CN102158417A (en) Method and device for optimizing multi-constraint quality of service (QoS) routing selection
CN108009642A (en) Distributed machines learning method and system
CN113098714A (en) Low-delay network slicing method based on deep reinforcement learning
CN113469325A (en) Layered federated learning method, computer equipment and storage medium for edge aggregation interval adaptive control
CN111585811B (en) Virtual optical network mapping method based on multi-agent deep reinforcement learning
CN113159287B (en) Distributed deep learning method based on gradient sparsity
CN109709985B (en) Unmanned aerial vehicle task optimization method, device and system
CN112862088A (en) Distributed deep learning method based on pipeline annular parameter communication
Xu et al. Living with artificial intelligence: A paradigm shift toward future network traffic control
CN115186806A (en) Distributed graph neural network training method supporting cross-node automatic differentiation
CN108768857B (en) Virtual route forwarding method, device and system
CN109636709A (en) A kind of figure calculation method suitable for heterogeneous platform
CN117311975A (en) Large model parallel training method, system and readable storage medium
CN116033492A (en) Method and device for segmenting Transformer model in mobile edge environment
CN116400963A (en) Model automatic parallel method, device and storage medium based on load balancing
CN116545856A (en) Service function chain deployment method, system and device based on reinforcement learning
CN115292044A (en) Data processing method and device, electronic equipment and storage medium
CN113824650B (en) Parameter transmission scheduling algorithm and system in distributed deep learning system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination