CN114144793A - Data transmission method and device, electronic equipment and readable storage medium - Google Patents

Info

Publication number
CN114144793A
Authority
CN
China
Prior art keywords
data
address
transmission
transmitted
network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980098672.1A
Other languages
Chinese (zh)
Inventor
何雷骏
董镇江
屠嘉晋
李震桁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiments of the present application provide a data transmission method and apparatus, an electronic device, and a readable storage medium. In the method, at least one piece of to-be-transmitted data is acquired from a storage unit in which N source addresses are provided, the to-be-transmitted data being stored in a scattered manner across the N source addresses. Based on a first preset relationship between source addresses and target addresses, the to-be-transmitted data stored in the 1st to (N/2)th source addresses is transmitted to the corresponding target addresses using a first transmission sub-network. The first preset relationship includes: when the source address is K, the corresponding target address is one of 0 to K, counting from 0. The first transmission sub-network comprises a plurality of layers; no switching node is present from position 2^(Y-1)+1 to position 2^Y of layer Y, and, when at least one switching node is present from the 1st position to position 2^Y of layer Y, none of those switching nodes includes an uplink connection line. The method can greatly reduce transmission overhead and calculation overhead and greatly improve the processing efficiency of sparse data.

Description

Data transmission method and device, electronic equipment and readable storage medium
Technical Field
Embodiments of the present disclosure relate to computer technologies, and in particular, to a data transmission method and apparatus, an electronic device, and a readable storage medium.
Background
In some fields involving computational processing of data, the data may be sparse. Taking a neural network as an example, sparsity is prevalent in both its feature maps and its parameters: a feature map may have a sparsity ratio of 20% to 80%, and the parameters may have a sparsity ratio of 50% to 90%. The higher the sparsity ratio, the more 0-value data the data contains, and this 0-value data does not contribute to the final calculation result. Transmitting and calculating this 0-value data is therefore an invalid operation. In a processor that performs data calculation processing, data may be stored in a storage medium; when calculation is required, the data must be transmitted from the storage medium to a calculation module of the processor. If the 0-value data is handled in the same way as non-0-value data, it must be transmitted from the storage medium to the calculation module, and the calculation module must process it, which causes large transmission overhead as well as calculation overhead.
Therefore, how to transmit and calculate data with sparsity to reduce invalid operations of transmission and calculation of 0-value data and reduce transmission overhead and calculation overhead is a problem to be solved urgently.
Disclosure of Invention
The embodiment of the application provides a data transmission method, a data transmission device, electronic equipment and a readable storage medium, which are used for reducing the transmission overhead and the calculation overhead of data in the electronic equipment.
In a first aspect, an embodiment of the present application provides a data transmission method. At least one piece of to-be-transmitted data is first obtained from a storage unit, where the storage unit is provided with N source addresses and the to-be-transmitted data is stored in a scattered manner across the N source addresses. Then, based on a first preset relationship between source addresses and target addresses, the to-be-transmitted data stored in the 1st to (N/2)th source addresses is transmitted to the corresponding target addresses using a first transmission sub-network. The first preset relationship includes: when the source address is K, the corresponding target address is one of 0 to K, counting from 0. In addition, the first transmission sub-network comprises a plurality of layers, each layer comprising at least one switching node; no switching node is present from position 2^(Y-1)+1 to position 2^Y of layer Y, and, when at least one switching node is present from the 1st position to position 2^Y of layer Y, none of those switching nodes includes an uplink connection line.
In this method, a transmission network for transmitting data between source addresses and target addresses is provided on the basis of the first preset relationship satisfied between them. In the first transmission sub-network of the transmission network, no switching node is present from position 2^(Y-1)+1 to position 2^Y of layer Y, and, when at least one switching node is present from the 1st position to position 2^Y of layer Y, none of those switching nodes includes an uplink connection line. No collision occurs when data is transmitted through this transmission network. Moreover, compared with a conventional collision-free network, the transmission network has significantly fewer switching nodes and significantly lower complexity. The transmission network therefore offers high transmission speed and occupies few transmission resources. When it is used to transmit sparse data, transmission overhead and calculation overhead can be greatly reduced, and the processing efficiency of sparse data can be greatly improved.
In an optional implementation manner, the method further includes:
based on a second preset relationship between source addresses and target addresses, transmitting the to-be-transmitted data stored in the (N/2+1)th to Nth source addresses to the corresponding target addresses using a second transmission sub-network. The second preset relationship includes: when the source address is L, the corresponding target address is one of M-1 to M-1-[L%(N/2)], counting down from M-1, where M is the number of target addresses and M is smaller than N. In addition, the second transmission sub-network comprises a plurality of layers, each layer comprising at least one switching node; no switching node is present from position 2^(S-1)+1 to position 2^S of layer S, and, when at least one switching node is present from the 1st position to position 2^S of layer S, none of those switching nodes includes an uplink connection line.
In this way, a transmission network for transmitting data between source addresses and target addresses is provided on the basis of the second preset relationship satisfied between them. In the second transmission sub-network of the transmission network, no switching node is present from position 2^(S-1)+1 to position 2^S of layer S, and, when at least one switching node is present from the 1st position to position 2^S of layer S, none of those switching nodes includes an uplink connection line. No collision occurs when data is transmitted through this transmission network. Moreover, compared with a conventional collision-free network, the transmission network has significantly fewer switching nodes and significantly lower complexity. The transmission network therefore offers high transmission speed and occupies few transmission resources. When it is used to transmit sparse data, transmission overhead and calculation overhead can be greatly reduced, and the processing efficiency of sparse data can be greatly improved.
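The two preset relationships can be made concrete by listing, for each source address, the target addresses it is allowed to map to. The following is a minimal sketch, not the claimed implementation: the function name `allowed_targets` is hypothetical, addresses are taken to be numbered from 0, and clamping the range to 0..M-1 is an assumption for cases where M is smaller than N/2.

```python
def allowed_targets(source, n, m):
    """Return the target addresses a 0-based source address may map to.

    n is the number of source addresses and m the number of target addresses.
    Hypothetical helper: clamping to the range 0..m-1 is an assumption.
    """
    if not 0 <= source < n:
        raise ValueError("source address out of range")
    half = n // 2
    if source < half:
        # First preset relationship: source K maps to one of 0..K.
        highest = min(source, m - 1)
        return list(range(0, highest + 1))
    # Second preset relationship: source L maps to one of
    # M-1 down to M-1-(L % (N/2)).
    lowest = max(m - 1 - (source % half), 0)
    return list(range(m - 1, lowest - 1, -1))


# Example with N = 8 source addresses and M = 4 target addresses.
table = {s: allowed_targets(s, 8, 4) for s in range(8)}
for s, targets in table.items():
    print(s, targets)
```

With N = 8 and M = 4, this lists 20 permitted (source, target) pairs, compared with the 8 × 4 = 32 pairs of a fully connected mapping.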
In an alternative implementation, the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.
In an optional implementation, when the first transmission sub-network is used to transmit the to-be-transmitted data stored in the 1st to (N/2)th source addresses to the corresponding target addresses, the target address corresponding to that data may first be obtained, the target address being represented as a binary value. Then, starting from the least significant bit (LSB) of the target address, a transmission path for the to-be-transmitted data in the first transmission sub-network is determined according to the value of each bit of the target address, and the data is transmitted to the target address along that path.
In an optional implementation, when the second transmission sub-network is used to transmit the to-be-transmitted data stored in the (N/2+1)th to Nth source addresses to the corresponding target addresses, the target address corresponding to that data may first be obtained, the target address being represented as a binary value. Then, starting from the LSB of the target address, a transmission path for the to-be-transmitted data in the second transmission sub-network is determined according to the value of each bit of the target address, and the data is transmitted to the target address along that path.
In the two optional modes, the transmission network is used for routing the data to be transmitted to the target address according to the LSB, so that the data transmission speed can be further improved.
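The LSB-first path selection described above can be illustrated with generic bit-by-bit routing in a full butterfly-style network. This is only a sketch under stated assumptions: the pruned sub-networks of the application are defined in its figures, which are not reproduced here, so the code models an unpruned network in which stage i either keeps the current position or flips bit i of it. The function name `lsb_first_path` is hypothetical.

```python
def lsb_first_path(source, target, n):
    """Return the position occupied at each layer, from layer 0 to layer log2(n).

    Assumption: a full butterfly-style network with log2(n) switching stages,
    where stage i can only change bit i of the current position.
    """
    stages = n.bit_length() - 1  # equals log2(n) when n is a power of two
    if n < 2 or n != 1 << stages:
        raise ValueError("n must be a power of two, at least 2")
    if not (0 <= source < n and 0 <= target < n):
        raise ValueError("addresses must lie in 0..n-1")

    position = source
    path = [position]
    for bit in range(stages):  # bit 0 is the LSB of the target address
        mask = 1 << bit
        if (position ^ target) & mask:
            # Bit differs from the target address: take the cross link.
            position ^= mask
        # If the bit already matches, the straight link keeps the position.
        path.append(position)
    return path


path = lsb_first_path(source=3, target=1, n=8)
print(path)
```

For N = 8 the returned path has log2(8)+1 = 4 entries, one per layer, which matches the layer count given for the sub-networks.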
In an optional implementation manner, the target address is an address in a calculation module, and the calculation module at least includes M addresses.
In an optional implementation, before the to-be-transmitted data stored in the 1st to (N/2)th source addresses is transmitted to the corresponding target addresses using the first transmission sub-network based on the first preset relationship, it may first be determined whether the number of pieces of to-be-transmitted data is greater than M. If it is greater than M, the at least one piece of to-be-transmitted data is divided into a plurality of groups of sub-data, and each group of sub-data is transmitted in its own transmission clock cycle.
In the method, when the number of the data to be transmitted is larger than that of the target addresses, the data to be transmitted is divided into a plurality of groups of subdata, and each group of subdata is transmitted under different clocks, so that the conflict between data transmission and operation is avoided, and the correctness of the data transmission and operation is ensured.
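The grouping step can be sketched as follows. The application does not state how the groups are formed, so splitting into consecutive groups of at most M items is an assumption, and the function name `split_into_clock_groups` is hypothetical.

```python
def split_into_clock_groups(items, m):
    """Split items into consecutive groups of at most m, one group per clock.

    Assumption: groups are consecutive chunks; the text only requires that
    each group be transmitted in a different transmission clock cycle.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    return [items[i:i + m] for i in range(0, len(items), m)]


# Six pieces of data and M = 4 target addresses: two clock cycles are needed.
groups = split_into_clock_groups(["d0", "d1", "d2", "d3", "d4", "d5"], 4)
for clock, group in enumerate(groups):
    print(clock, group)
```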
In an alternative implementation, N is 8 and M is 4.
In a second aspect, an embodiment of the present application provides a data transmission apparatus, including: the device comprises a storage unit, a target module, a transmission network and a control module.
N source addresses are set in the storage unit, and a plurality of target addresses are set in the target module.
The transmission network is connected with the storage unit and the target module respectively.
The transmission network comprises a first transmission sub-network comprising a plurality of layers, each layer comprising at least one switching node; no switching node is present from position 2^(Y-1)+1 to position 2^Y of layer Y, and, when at least one switching node is present from the 1st position to position 2^Y of layer Y, none of those switching nodes includes an uplink connection line.
The control module is configured to acquire at least one piece of to-be-transmitted data from the storage unit, the to-be-transmitted data being stored in a scattered manner across the N source addresses, and to transmit the to-be-transmitted data stored in the 1st to (N/2)th source addresses to the corresponding target addresses using the first transmission sub-network based on a first preset relationship between source addresses and target addresses, where the first preset relationship includes: when the source address is K, the corresponding target address is one of 0 to K, counting from 0.
In an alternative implementation, the transport network further comprises a second transport subnetwork.
The second transmission sub-network comprises a plurality of layers, each layer comprising at least one switching node; no switching node is present from position 2^(S-1)+1 to position 2^S of layer S, and, when at least one switching node is present from the 1st position to position 2^S of layer S, none of those switching nodes includes an uplink connection line.
The control module is further configured to transmit the to-be-transmitted data stored in the (N/2+1)th to Nth source addresses to the corresponding target addresses using the second transmission sub-network based on a second preset relationship between source addresses and target addresses, where the second preset relationship includes: when the source address is L, the corresponding target address is one of M-1 to M-1-[L%(N/2)], counting down from M-1, where M is the number of target addresses and M is smaller than N.
In an alternative implementation, the number of layers of the first transmission sub-network is log2(N)+1, and/or the number of layers of the second transmission sub-network is log2(N)+1.
In an optional implementation manner, the control module is specifically configured to:
acquiring the target address corresponding to the to-be-transmitted data stored in the 1st to (N/2)th source addresses, where the target address is represented as a binary value; and, starting from the LSB of the target address, determining a transmission path for the to-be-transmitted data in the first transmission sub-network according to the value of each bit of the target address, and transmitting the data to the target address along that path.
In an optional implementation manner, the control module is specifically configured to:
acquiring the target address corresponding to the to-be-transmitted data stored in the (N/2+1)th to Nth source addresses, where the target address is represented as a binary value; and, starting from the LSB of the target address, determining a transmission path for the to-be-transmitted data in the second transmission sub-network according to the value of each bit of the target address, and transmitting the data to the target address along that path.
In each of the above optional implementations, the target module is a calculation module, and the calculation module at least includes M addresses.
In an alternative implementation, the control module is further configured to:
when the number of the data to be transmitted is larger than M, dividing at least one data to be transmitted into a plurality of groups of subdata, and transmitting each group of subdata under one transmission clock.
In an alternative implementation, N is 8 and M is 4.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory and a processor.
The processor is configured to be coupled to the memory, read and execute instructions in the memory, so as to implement the method steps of the first aspect.
In a fourth aspect, the present application provides a computer program product, wherein the computer program product includes computer program code, and when the computer program code is executed by a computer, the computer is caused to execute the method of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores computer instructions, and when the computer instructions are executed by a computer, the computer is caused to execute the instructions of the method according to the first aspect.
In a sixth aspect, an embodiment of the present application provides a chip, where the chip is connected to a memory, and is configured to read and execute a software program stored in the memory, so as to implement the method provided in the first aspect.
Drawings
FIG. 1 is a diagram illustrating an exemplary process of performing a convolution operation on a section of parameters (weights) and a section of feature map in a neural network;
FIG. 2 is a schematic flowchart of a data transmission method according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a conventional butterfly network;
FIG. 4 is a schematic structural diagram of a reverse butterfly network;
FIG. 5(a) shows the evolution process of the first-half sub-network;
FIG. 5(b) is a structure diagram of the transmission network after evolution;
FIG. 6(a) shows the evolution process of the second-half sub-network;
FIG. 6(b) is a network structure diagram after evolution;
FIG. 7 is a schematic diagram of a network structure obtained by using the evolution of the two sub-networks shown above;
FIG. 8 is a schematic flowchart of a data transmission method according to an embodiment of the present application;
FIG. 9 is a block diagram of a data transmission device according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
First, the data sparsity ratio will be described by way of an example.
FIG. 1 is a diagram illustrating an exemplary process of performing a convolution operation on a section of parameters (weights) and a section of feature map in a neural network. As shown in FIG. 1, the neural network includes a section of parameters and a section of feature map. The section of parameters consists of a plurality of data, some of which are 0; likewise, the section of feature map consists of a plurality of data, some of which are 0. When the convolution operation is performed on the two sections, the feature map data and the parameter data with the same index are multiplied, and the products are then accumulated. Because both sections contain 0 values, only the data at indices 2 and 7 yield non-0 products and thus contribute to the final operation result; the data at the remaining indices 0, 1, 3, 4, 5, and 6 do not contribute. The data sparsity ratio in this example is therefore 75%.
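The 75% figure can be checked with a short calculation. The sketch below is illustrative only: FIG. 1 is not reproduced here, so the two 8-element sequences are hypothetical values chosen so that only indices 2 and 7 give a non-0 product, and the function names are likewise hypothetical.

```python
# Hypothetical 8-element sequences, chosen so that only indices 2 and 7
# have a non-zero product; they are not the actual values of FIG. 1.
weights = [0, 0, 1, 0, 0, 0, 0, -1]
feature_map = [0, 2, 3, 0, 4, 0, 0, 5]


def contributing_indices(w, f):
    """Return the indices whose product w[i] * f[i] is non-zero."""
    return [i for i, (a, b) in enumerate(zip(w, f)) if a * b != 0]


def sparsity_ratio(w, f):
    """Fraction of index positions that do not contribute to the result."""
    return 1 - len(contributing_indices(w, f)) / len(w)


valid = contributing_indices(weights, feature_map)
# Accumulating only the contributing products gives the same result
# as accumulating all eight products.
result = sum(weights[i] * feature_map[i] for i in valid)
print(valid, sparsity_ratio(weights, feature_map), result)
```

Six of the eight index positions contribute nothing, so the ratio is 6/8 = 75%, and only two multiplications are needed instead of eight.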
In an electronic device that performs arithmetic processing, data is first stored in a storage medium before arithmetic processing, and further, when arithmetic processing is performed, the data needs to be transmitted to a calculation module for arithmetic processing. After the calculation module performs the calculation, the calculation result may need to be sent to the next calculation module, and so on.
For convenience of description, in the embodiments of the present application, the address at which data is stored before the operation is referred to as the source address, and the address of the data in the calculation module during operation processing is referred to as the target address. Data must be transmitted from the source address to the target address before it is processed.
It should be noted that, in the embodiment of the present application, the source address may refer to an address in a storage medium, for example, an address in a storage medium such as a Static Random Access Memory (SRAM) or a Dynamic Random Access Memory (DRAM), or the source address may also refer to an address in a computing module. The target address may refer to an address in the computing module.
In addition, in the embodiment of the present application, "data" refers to data that can be used for calculation, such as a half-precision floating point number, a full-precision floating point number, and an integer, "data" may be represented by decimal or binary, and the specific representation mode of the data in the embodiment of the present application is not particularly limited. Taking the parameters shown in fig. 1 as an example, a segment of parameters is composed of 8 data, i.e., 0, 1, 0, -1, and each data is an integer and is expressed by decimal.
In the example of fig. 1, a piece of parameter and a piece of feature map may be respectively referred to as a data sequence, and the data sequence is uniformly transmitted from a source address to a destination address in the calculation processing. Specifically, one data in the data sequence is stored in a source address, one source address corresponds to one destination address, and the data in each source address is transmitted to the corresponding destination address.
In one possible design, before the data sequence is transmitted from the source address to the destination address, first, valid data in the data sequence is screened, where the valid data may refer to data in the data sequence that contributes to the operation result, and after the valid data is screened, the valid data is transmitted through a transmission network between a source module at the source address and a destination module at the destination address. The source module comprises a plurality of addresses, and the target module comprises a plurality of addresses. The data stored in each address of the source module is transmitted to each address of the target module through a transmission network between the source module and the target module. For example, assuming that the source module is an SRAM, the target module is a certain calculation module a, the SRAM has 8 addresses, and the calculation module a has 4 addresses, all data stored in the 8 addresses in the SRAM may be transmitted to the 4 addresses in the calculation module a through a transmission network between the SRAM and the calculation module a. On the basis of screening out effective data, if a transmission network with high transmission speed and less occupied transmission resources can be provided, the transmission overhead and the calculation overhead of data with sparsity can be greatly reduced.
The following embodiments of the present application are directed to provide a data transmission method based on a transmission network with a fast transmission speed and a small occupation of transmission resources, so that transmission overhead and calculation overhead can be greatly reduced when data with sparsity is transmitted based on the network.
The method may be applied to any electronic device comprising a storage medium and a computing module. Illustratively, the electronic device may be a communication device such as a terminal device or a network device, or the electronic device may also be a server or the like.
Taking the case where the electronic device is a terminal device as an example, the terminal device may also be referred to as a terminal, user equipment (UE), a mobile station (MS), a mobile terminal (MT), and the like. The terminal device may be a mobile phone, a tablet computer (pad), a computer with a wireless transceiving function, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, and the like.
Taking the case where the electronic device is a network device as an example, the network device may be a base station, such as a base transceiver station (BTS) in the global system for mobile communications (GSM) or code division multiple access (CDMA), a NodeB in wideband code division multiple access (WCDMA), an evolved NodeB (eNB or e-NodeB) in LTE, or a gNB in NR. The base station may also be a wireless controller in a cloud radio access network (CRAN) scenario, or may be a relay station, an access point, a vehicle-mounted device, a wearable device, a network device in a 5G network, a network device in a future evolved PLMN, or the like.
Fig. 2 is a schematic flow chart of a data transmission method provided in an embodiment of the present application, and as shown in fig. 2, the method may include:
S201, at least one piece of to-be-transmitted data is acquired from a storage unit, where N source addresses are provided in the storage unit and the to-be-transmitted data is stored in a scattered manner across the N source addresses.
The at least one piece of to-be-transmitted data may be data in a data sequence. The data sequence may be any data sequence in the electronic device that needs to be transmitted to the calculation module for calculation, for example the section of parameters or the section of feature map illustrated in FIG. 1.
In the embodiment of the present application, the computing module may also be referred to as a computing unit.
Optionally, the storage unit may be an SRAM, a DRAM, or the like, where the storage unit includes a plurality of source addresses, and each source address may store one to-be-transmitted data.
Optionally, the data to be transmitted may be valid data in a data sequence. Before saving the data to be transmitted to the storage unit, the electronic device may mark the valid data in the data sequence in advance. For example, the electronic device may determine and mark the valid data in the data sequence and in another data sequence according to the other data sequence with which the data sequence is to be operated and the manner of operation between the two. Take as an example the case where the data sequence is the section of parameters containing 0 values in the neural network of FIG. 1 and the other data sequence is the section of feature map containing 0 values in FIG. 1. The electronic device first reads the section of parameters and the section of feature map to be operated on with it, and learns that the two are to be multiplied; it then marks as valid data the data in the section of feature map and the section of parameters whose product is not 0. Specifically, the data at indices 2 and 7 in FIG. 1 are marked as valid data: the valid data in the section of parameters are 1 and -1, and the valid data in the section of feature map are 3 and 5. After the valid data is marked, it is stored in a scattered manner across the N source addresses of the storage unit.
Optionally, the N source addresses may be all addresses in a module that stores data to be transmitted, or the N source addresses may also be partial addresses in a module that stores data to be transmitted, which is not specifically limited in this embodiment of the present application.
Optionally, N is an even number.
With continued reference to the example of FIG. 1, a piece of parameter includes 8 data, respectively 0, 1, 0, -1, the 8 data being stored in 8 source addresses of the storage unit, respectively. In this example, N is 8.
S202, based on a first preset relationship between source addresses and target addresses, transmitting data to be transmitted, which are stored in the 1 st source address to the N/2 nd source address, to corresponding target addresses by using a first transmission sub-network, wherein the first preset relationship comprises: when the source address is K, the corresponding destination address is one of 0 to K starting from 0.
Optionally, the at least one to-be-transmitted data is respectively transmitted to a target address in the computing module. The calculation module where the target address is located may include at least M addresses. The data in the N source addresses are transmitted to the M addresses of the computation module. In this embodiment of the present application, M is smaller than N, that is, the number of data calculated by the calculation module at a single time is smaller than the number of data stored in the module where the source address is located, so as to align valid data.
When the data of the N source addresses is transmitted to the M addresses of the computing module, on one hand, a first source address of the N source addresses and a target address corresponding to the first source address satisfy a first preset relationship, where the first preset relationship includes: when the source address is K, the destination address is one of 0 to K starting from 0. Wherein K is a number of 0 or more. The first source address is any one of the 1 st source address to the N/2 nd source address.
Illustratively, the mapping relationship between source addresses and destination addresses can be expressed by means of table entry management. In a conventional network such as a Crossbar network, the mapping relationship between source addresses and destination addresses is a full connection relationship; that is, the data stored in one source address may be transmitted to any destination address. In the embodiment of the present application, by contrast, for a source address K among the 1st source address to the N/2th source address, the corresponding destination address is no longer an arbitrary destination address but one of 0 to K. This design reduces the complexity of the transmission network while still ensuring normal data transmission.
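The difference between the full connection relationship and the first preset relationship can be illustrated by counting mapping-table entries. The sketch below is an assumption-laden illustration: the function names are hypothetical, addresses are numbered from 0, and the count covers only the first half of the source addresses. It counts permitted (source, destination) pairs, not hardware resources.

```python
def allowed_targets_first(k):
    """Destinations permitted for first-half source address k: 0..k inclusive."""
    return list(range(k + 1))


def table_sizes(n, m):
    """Compare mapping-table sizes for the first n/2 source addresses.

    Returns (full, restricted): the number of (source, destination) pairs
    under a full connection relationship and under the first preset
    relationship, with destinations limited to the m addresses available.
    """
    half = n // 2
    full = half * m
    restricted = sum(min(len(allowed_targets_first(k)), m) for k in range(half))
    return full, restricted


print(allowed_targets_first(2))
print(table_sizes(8, 4))
```

With N = 8 and M = 4, the first half has 16 pairs under full connection and 1 + 2 + 3 + 4 = 10 pairs under the first preset relationship.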
It is to be noted that, for convenience of description, it is assumed in the embodiment of the present application that the source address and the destination address are numbered from 0, and thus, one of 0 to K denotes the first destination address to the K +1 th destination address. For example, assume that the target address is an address in the computing module, and the computing module includes M addresses, which are numbered from 0, so that target address 0 represents the first target address in the computing module, and target address M-1 represents the mth target address in the computing module.
For the 1 st source address to the N/2 th source address in the module storing the data to be transmitted, that is, the first half source address in the module storing the data to be transmitted, the data held in these addresses is transmitted to the address starting from 0 in the calculation module, and the destination address to which the data in the source address K is transmitted is one of 0 to K. For the first half of source addresses, the corresponding destination addresses are arranged in the forward direction.
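One assignment that satisfies the forward arrangement is to pack the valid data of the first half onto consecutive destination addresses starting from 0. The sketch below assumes that valid data are the non-zero entries and that the j-th valid datum goes to destination j; both the function name and the sample data are hypothetical.

```python
def assign_forward(first_half):
    """Map the source address of each non-zero entry to targets 0, 1, 2, ...

    Because at most k valid entries can precede source address k, the
    assigned target never exceeds k, so the first preset relationship holds.
    """
    mapping = {}
    next_target = 0
    for source, value in enumerate(first_half):
        if value != 0:
            mapping[source] = next_target
            next_target += 1
    return mapping


# Hypothetical contents of source addresses 0..3.
mapping = assign_forward([0, 9, 0, 4])
print(mapping)

# Every target lies in the range 0..K required for source address K.
assert all(0 <= target <= source for source, target in mapping.items())
```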
Data is transmitted between the module where the source addresses are located and the calculation module through a specific transmission network.
In an embodiment of the application, the above-mentioned transport network includes a first transport sub-network, the first transport sub-network is configured to transport data to be transported stored in the 1 st to N/2 nd source addresses, the first transport sub-network includes a plurality of layers, each layer includes at least one switching node, no switching node exists from the 2^ (Y-1) +1 position to the 2^ Y position of the layer Y, and each switching node of the at least one switching node does not include an uplink line when at least one switching node exists from the 1 st to the 2^ Y position of the layer Y.
In the embodiment of the present application, the symbol "^" represents a power operation, for example, 2^ Y represents a power of Y of 2, and the following description is not separately explained.
It is worth noting that in the embodiments of the present application, the switching node may be a logic device implemented by circuit logic. Illustratively, the switching node may be a 2-2 Multiplexer (MUX) or the like.
In the above transmission network, the first transmission sub-network is used to transmit the data to be transmitted stored in the 1 st source address to the N/2 nd source address, that is, the first transmission sub-network is used to transmit the data to be transmitted in the first half source addresses. In the embodiment of the application, the number of layers of the transmission network can be flexibly set. As an alternative embodiment, the number of layers of the transport network may be determined according to the number of source addresses. When the number of source addresses is N as described above, the number of layers of the transport network may be log2(N) +1 rounded up.
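The optional layer-count rule can be written as a short calculation. The function name layer_count is hypothetical; integer arithmetic is used in place of a floating-point logarithm so that the rounding is exact.

```python
def layer_count(n):
    """Number of layers: log2(n) rounded up, plus 1.

    (n - 1).bit_length() equals the ceiling of log2(n) for n >= 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (n - 1).bit_length() + 1


for n in (4, 6, 8, 16):
    print(n, layer_count(n))
```

For N = 8 this gives 4 layers, which agrees with the 4-layer networks (layer 0 to layer 3) described for fig. 5(b).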
In a specific implementation process, the transmission network of the embodiment of the present application may be obtained by evolving a conventional transmission network.
The characteristics of the transmission network according to the embodiment of the present application will be described below by taking a conventional butterfly network as an example.
Fig. 3 is a schematic structural diagram of a conventional butterfly network. As shown in fig. 3, the butterfly network is responsible for transmitting the data of 8 source addresses to 4 destination addresses. Data in different source addresses may need to be transmitted over the same transmission line, which may cause a collision. For example, the data stored in source address 0 and the data stored in source address 4 may need to use the transmission line between node 1 and node 2 at the same time, thereby causing a collision.
Based on the conventional butterfly network shown in fig. 3, the embodiment of the present application first proposes an inverse butterfly network structure. Fig. 4 is a schematic structural diagram of the inverse butterfly network. As shown in fig. 4, the network includes two transmission sub-networks: one transmission sub-network (the first-half transmission sub-network) is responsible for transmitting the data in the first half of the source addresses to the destination addresses, and the other transmission sub-network (the second-half transmission sub-network) is responsible for transmitting the data in the second half of the source addresses to the destination addresses. The first half and the second half of the source addresses are defined as follows: assuming that the network includes N source addresses, the first half refers to source addresses 0 through N/2-1 and the second half refers to source addresses N/2 through N-1. Both transmission sub-networks comprise a plurality of layers, each layer comprising a plurality of switching nodes. Each node of the first layer of each transmission sub-network is connected to a source address, and each node of the last layer of each transmission sub-network is connected to a destination address. It is worth noting that in the network structure shown in fig. 4, switching node A, switching node B, switching node C and switching node D, which are connected to the destination addresses, belong to both transmission sub-networks simultaneously.
Based on the network structure shown in fig. 4, the first-half transmission sub-network may be evolved according to the first preset relationship, so as to obtain the transmission network described in step S202. Fig. 5(a) shows the evolution process of the first-half sub-network, and fig. 5(b) shows the structure of the transmission network after the evolution. As shown in fig. 5(a) and 5(b), the transmission network includes a first transmission sub-network and a second transmission sub-network. The first transmission sub-network is responsible for transmitting the data in the first half of the source addresses to the target addresses, and the second transmission sub-network is responsible for transmitting the data in the second half of the source addresses to the target addresses. Both sub-networks comprise a plurality of layers, each layer comprising a plurality of switching nodes. Each node of the first layer of each transmission sub-network is connected to a source address, and each node of the last layer of each sub-network is connected to a destination address. It is to be noted that in the network structure shown in fig. 5(b), the switching nodes connected to the destination addresses belong to both sub-networks. Based on the first preset relationship, the following evolution is performed on the first-half sub-network of the inverse butterfly network shown in fig. 4.
For layer Y in the first transmission sub-network, the evolution is as follows. Y is greater than or equal to 0 and less than or equal to the number of layers of the transmission network minus 1. For example, when the number of layers of the first transmission sub-network is the result of rounding up log2(N) plus 1, the value of Y is greater than or equal to 0 and less than or equal to the result of rounding up log2(N).
The number of layers of the first transmission sub-network and the number of layers of the second transmission sub-network are each the same as the number of layers of the transmission network.
(1) In layer Y, the uplink connection lines of switching node 0 to switching node (2^Y)-1 are omitted.
Optionally, the switching nodes in each layer of the first transmission subnetwork may be numbered as follows:
A. The sequence numbers of the switching nodes start from 0. For example, switching node 0 represents the 1st switching node, and switching node (2^Y)-1 represents the 2^Y-th switching node.
B. In layer 0, the number of switching nodes connected to the smallest source address is the smallest, and so on. For example, in the first transport subnetwork shown in fig. 5(a), the switching node in layer 0 connected to the source address 0 is switching node 0, the switching node in layer 0 connected to the source address 1 is switching node 1, and so on.
C. In the other layers than layer 0 and the last layer of the first transport subnetwork, the number of each switching node is in agreement with the number of the switching node in the first layer, respectively, which is located at the same position as each switching node. Illustratively, the layer 1 includes 4 switching nodes, and the lowest switching node is the same as the switching node 0 in the layer 0 in position, that is, both belong to the lowest switching node of the layer where the switching node is located, and then the lowest switching node in the layer 1 is the switching node 0. The next lower switching node is in the same position as the switching node 1 in the layer 0, i.e. both belong to the switching node below the layer, and therefore the next lower switching node in the layer 1 is the switching node 1. By analogy, the number of each switching node in the other layers than layer 0 and the last layer of the first transport subnetwork can be derived.
D. In the last layer, the number of the switching node connected to the smallest destination address is the smallest, and so on. For example, in the first transport subnetwork shown in fig. 5(a), the switching node in layer 3 connected to destination address 0 is switching node 0, the switching node in layer 3 connected to destination address 1 is switching node 1, and so on.
In addition, in the embodiment of the present application, the serial numbers of the source address and the destination address are also numbered from 0. For example, source address 0 represents the 1 st source address, and so on.
Referring to fig. 4, 5(a) and 5(b), taking Y as 1 as an example, since in the first preset relationship, the data in the source address K can only be transmitted to one of the destination addresses 0 to K, and for layer 1, the data in the source address does not need to be transmitted upward when passing through the switching node 0 or the switching node 1 in layer 1, and therefore, after omitting the uplink connection lines of the switching node 0 and the switching node 1 in layer 1, the normal transmission of the data in the source address is not affected.
Here, the upward transmission refers to the transmission of data from the switch node with the smaller number to the switch node with the larger number, and for example, referring to fig. 5(a), when the switch node 0 of layer 1 transmits data to the switch node 2 of layer 2, the data is upward transmission.
Correspondingly, the uplink connection line refers to a connection between a switch node with a smaller number in a layer with a smaller number to a switch node with a larger number in a layer with a larger number. Illustratively, for layers 1 and 2, layer 1 is the lower numbered layer and layer 2 is the higher numbered layer. For the switching node 0 in layer 1 and the switching node 2 in layer 2, the switching node 0 in layer 1 is the switching node with the smaller number, and the switching node 2 in layer 2 is the switching node with the larger number, so the connection between the switching node 0 in layer 1 and the switching node 2 in layer 2 is an uplink connection line.
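The definition of an uplink connection line can be expressed as a simple predicate. This is a minimal sketch; representing a switching node as a hypothetical (layer, node) pair is an assumption made only for illustration.

```python
def is_uplink(src, dst):
    """Return True if the connection from src to dst is an uplink line.

    src and dst are (layer, node) pairs. A connection is an uplink when it
    runs from a lower-numbered node in a lower-numbered layer to a
    higher-numbered node in a higher-numbered layer.
    """
    src_layer, src_node = src
    dst_layer, dst_node = dst
    return src_layer < dst_layer and src_node < dst_node


# The example from the text: node 0 of layer 1 to node 2 of layer 2.
print(is_uplink((1, 0), (2, 2)))
# A straight connection between equally numbered nodes is not an uplink.
print(is_uplink((1, 0), (2, 0)))
```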
(2) In layer Y, switch nodes from position 2^ (Y-1) +1 to position 2^ Y are deleted.
Wherein the layer Y is a layer other than the first and last layer in the first transport subnetwork.
It is worth noting that after deleting the switching node from the 2^ (Y-1) +1 position to the 2^ Y position, the 2^ (Y-1) +1 position to the 2^ Y position still exist, and the switching node no longer exists at these positions.
The deleted switching nodes in the step include 2 × 2 switching nodes and 2 × 1 switching nodes. Wherein, a 2x2 switching node refers to a node comprising 2 input connections and 2 output connections, and a 2x1 switching node refers to a node comprising 2 input connections and 1 output connection.
With continuing reference to fig. 4, 5(a) and 5(b), taking Y as 1 as an example, after the above (1) is performed, the switching node 1 in layer 1 is only used to connect the switching node 1 in layer 0 and the switching node 1 in layer 2, so after the switching node 1 in layer 1 is deleted, the switching node 1 in layer 0 and the switching node 1 in layer 2 are directly connected without affecting the normal transmission of data in the source address. According to the principle, the switching node 2 and the switching node 3 of the layer 2 are also deleted, and the switching node 3 of the layer 1 is directly connected with the switching node 3 of the layer 3; switching node 2 of layer 1 and switching node 2 of layer 3 are connected directly.
(3) When Y > 1, switching node 0 to switching node 2^(Y-1)-1 are modified from 2x2 nodes to 2x1 nodes or 1x2 nodes.
Alternatively, this step may be performed independently of the above (1) and (2), or the result of this step may be satisfied if the above (1) and (2) are performed.
After the above evolution, the Y layer of the resulting transmission network satisfies the following conditions:
(1) there are no switching nodes at the 2^ (Y-1) +1 th position to the 2^ Y position of layer Y.
(2) When at least one switching node exists at the 1st to 2^Y-th positions of layer Y, each of the at least one switching node does not include an uplink connection line.
With reference to fig. 4, 5(a) and 5(b), in the first transport sub-network of the transport network, the switching node 1 on layer 1, the switching node 2 and the switching node 3 on layer 2 are deleted after the above evolution.
Specifically, the transport network in fig. 5(b) is configured to transmit data in 8 source addresses to 4 destination addresses, and a first transport sub-network in the transport network is configured to transmit data in source addresses 0 to 3, where the first transport sub-network includes 4 layers, i.e., layer 0, layer 1, layer 2, and layer 3, and layer 0 includes 4 switching nodes, i.e., switching node 0, switching node 1, switching node 2, and switching node 3. Layer 1 includes 3 switching nodes, node 0, node 2, and node 3. Layer 2 includes 2 switching nodes, node 0 and node 1. Layer 3 includes 4 switching nodes. The connection mode of each switching node in each layer can refer to fig. 5(b), and is not described one by one here.
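The node-deletion rule can be checked against the layer contents listed above with a short sketch. The function names are hypothetical; positions are 1-based as in the text, node numbers are 0-based, and the first and last layers are left untouched, as stated for the deletion step.

```python
def removed_nodes(y):
    """0-based node numbers at positions 2^(y-1)+1 .. 2^y of layer y."""
    return list(range(2 ** (y - 1), 2 ** y))


def surviving_nodes(num_layers, width):
    """Return {layer: remaining node numbers} for one transmission sub-network."""
    layers = {}
    for y in range(num_layers):
        if y == 0 or y == num_layers - 1:
            # The first and last layers keep all of their switching nodes.
            layers[y] = list(range(width))
        else:
            gone = set(removed_nodes(y))
            layers[y] = [i for i in range(width) if i not in gone]
    return layers


# First transmission sub-network of fig. 5(b): 4 layers, 4 positions per layer.
for layer, nodes in surviving_nodes(4, 4).items():
    print(layer, nodes)
```

Under these assumptions, layer 1 keeps nodes 0, 2 and 3 and layer 2 keeps nodes 0 and 1, which matches the description of fig. 5(b).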
In the embodiment of the present application, the source address and the destination address satisfy the first predetermined relationship, so that when data in the first half of the source addresses is transmitted through the first transmission sub-network shown in fig. 5(b), no collision occurs, and at the same time, compared with a conventional non-collision transmission network, such as a Crossbar network, the number of switching nodes of the transmission network is significantly reduced, and the complexity of the transmission network is significantly reduced.
In this embodiment, a transmission network for transmitting data between a source address and a destination address is provided based on a first predetermined relationship satisfied between the source address and the destination address, in a first transmission sub-network of the transmission network, no switching node exists from a position 2^ (Y-1) +1 to a position 2^ Y of a layer Y, and when at least one switching node exists from the position 1 to the position 2^ Y of the layer Y, each switching node of the at least one switching node does not include an uplink line, and when data is transmitted through the transmission network, a collision condition is not generated. Meanwhile, compared with the traditional network which does not collide, the number of the switching nodes of the transmission network is obviously reduced, and the complexity of the transmission network is obviously reduced. Therefore, the transmission network has the advantages of high transmission speed and less occupied transmission resources. When the transmission network is used for transmitting sparse data, the transmission overhead and the calculation overhead can be greatly reduced, and the processing efficiency of the sparse data is greatly improved.
As an optional implementation manner, in the aforementioned N source addresses, a second source address and a destination address corresponding to the second source address satisfy a second preset relationship, where the second preset relationship includes: when the source address is L, the destination address is one of M-1 to M-1- [ L% (N/2) ] starting from M-1, wherein the second source address is any one of the N/2+1 th source address to the N-th source address. L is a number greater than 0.
For the source addresses from the (N/2 + 1) th source address to the (N) th source address in the module for storing the data to be transmitted, namely the source addresses of the second half part in the module for storing the data to be transmitted, the data stored in the addresses are transmitted to the addresses starting from M-1 in the calculation module, and the target address to which the data in the source address L is transmitted is one of M-1 to M-1- [ L% (N/2) ]. For the second half source address, the corresponding target address is in a reverse arrangement mode.
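The set of destination addresses permitted by the second preset relationship can be enumerated as follows. The function name is hypothetical, addresses are numbered from 0, and the sketch assumes M is at least N/2 so that the lower bound M-1-[L%(N/2)] does not become negative.

```python
def allowed_targets_second(l, n, m):
    """Destinations permitted for second-half source address l.

    The range runs downward from m-1 to m-1-(l % (n/2)), inclusive.
    """
    offset = l % (n // 2)
    return list(range(m - 1, m - 1 - offset - 1, -1))


# N = 8 source addresses and M = 4 destination addresses.
for l in range(4, 8):
    print(l, allowed_targets_second(l, 8, 4))
```

For source address 5 with N = 8 and M = 4, the permitted destinations are 3 and 2.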
In an embodiment of the application, the transmission network further includes a second transmission sub-network for transmitting the data to be transmitted stored in the N/2+1th source address to the Nth source address. The second transmission sub-network includes a plurality of layers, and each layer includes at least one switching node. No switching node exists from the 2^(S-1)+1 position to the 2^S position of layer S, and when at least one switching node exists from the 1st position to the 2^S position in layer S, each of the at least one switching node does not include an uplink connection line.
Based on the second predetermined relationship, the data to be transmitted stored in the N/2+1 th source address to the nth source address may be transmitted to the corresponding destination address using the second transmission subnetwork.
Based on the second predetermined relationship, the second transmission sub-network in the transmission network can be evolved by using the above evolution process based on the network structure shown in fig. 4. Fig. 6(a) shows the evolution process of the latter half of the sub-network, and fig. 6(b) shows the structure of the network after the evolution. Based on the second preset relationship, the following evolution process is performed on the second half sub-network of the inverse butterfly network shown in fig. 4. It should be noted that in fig. 6(a) and 6(b), the second transmission subnetworks are first connected in the reverse arrangement, specifically, node 0 of layer 2 is connected to node 3 of layer 3, node 1 of layer 2 is connected to node 2 of layer 3, and so on.
For layer S in the second transmission sub-network, the evolution is as follows. S is greater than or equal to 0 and less than or equal to the number of layers of the second transmission sub-network minus 1. For example, when the number of layers of the second transmission sub-network is the result of rounding up log2(N) plus 1, the value of S is greater than or equal to 0 and less than or equal to the result of rounding up log2(N).
(1) In layer S, the uplink connection from switch node 0 to switch node 2^ S-1 is omitted.
Optionally, the switching nodes in each layer of the second transmission subnetwork may be numbered as follows:
A. The sequence numbers of the switching nodes start from 0. For example, switching node 0 represents the 1st switching node, and switching node (2^S)-1 represents the 2^S-th switching node.
B. In layer 0, the number of switching nodes connected to the smallest source address is the smallest, and so on. For example, in the second transmission subnetwork shown in fig. 5(a), the switching node in layer 0 connected to the source address 0 is switching node 0, the switching node in layer 0 connected to the source address 1 is switching node 1, and so on.
C. In the other layers than layer 0 and the last layer of the second transmission subnetwork, the number of each switching node is in agreement with the number of the switching node in the first layer, which is located at the same position as the switching node. Illustratively, the layer 1 includes 4 switching nodes, and the lowest switching node is the same as the switching node 0 in the layer 0 in position, that is, both belong to the lowest switching node of the layer where the switching node is located, and then the lowest switching node in the layer 1 is the switching node 0. The next lower switching node is in the same position as the switching node 1 in the layer 0, i.e. both belong to the switching node below the layer, and therefore the next lower switching node in the layer 1 is the switching node 1. By analogy, the number of each switching node in the other layers than layer 0 and the last layer of the second transmission subnetwork can be derived.
D. In the last layer, the number of the switching node connected to the smallest destination address is the smallest, and so on. For example, in the second transmission subnetwork shown in fig. 5(a), the switching node in layer 3 connected to destination address 0 is switching node 0, the switching node in layer 3 connected to destination address 1 is switching node 1, and so on.
Referring to fig. 4, fig. 6(a) and fig. 6(b), take S as 1 as an example. In the second preset relationship, the data in source address L can only be transmitted to one of the destination addresses M-1 to M-1-[L%(N/2)]. For layer 1, therefore, the data in a source address does not need to be transmitted upward when passing through switching node 0 or switching node 1 in layer 1, and omitting the uplink connection lines of switching node 0 and switching node 1 in layer 1 does not affect the normal transmission of the data in the source addresses.
(2) In layer S, switch nodes from position 2^ (S-1) +1 to position 2^ S are deleted.
Wherein the layer S is a layer other than the first layer and the last layer in the second transmission subnetwork.
It is worth noting that after deleting the switching nodes from the 2^ (S-1) +1 position to the 2^ S position, the 2^ (S-1) +1 position to the 2^ S position still exist, and the switching nodes no longer exist at these positions.
The deleted switching nodes in the step include 2 × 2 switching nodes and 2 × 1 switching nodes.
With continuing reference to fig. 4, fig. 6(a) and fig. 6(b), take S as 1 as an example. After the above (1) is performed, switching node 1 in layer 1 is only used to connect switching node 1 in layer 0 and switching node 1 in layer 2. Therefore, after switching node 1 in layer 1 is deleted, switching node 1 in layer 0 and switching node 1 in layer 2 are connected directly, without affecting the normal transmission of the data in the source addresses. According to the same principle, switching node 2 and switching node 3 of layer 2 are also deleted; switching node 3 of layer 1 is connected directly to switching node 0 of layer 3, and switching node 2 of layer 1 is connected directly to switching node 1 of layer 3.
(3) When S > 1, switching node 0 to switching node 2^(S-1)-1 are modified from 2x2 nodes to 2x1 nodes or 1x2 nodes.
Alternatively, this step may be performed independently of the above (1) and (2), or the result of this step may be satisfied if the above (1) and (2) are performed.
After the above evolution, the S layer of the resulting transmission network satisfies the following conditions:
(1) there are no switching nodes at layer S from location 2^ (S-1) +1 to location 2^ S.
(2) When at least one switching node exists at the 1st to 2^S-th positions of layer S, each of the at least one switching node does not include an uplink connection line.
With reference to fig. 4, 6(a) and 6(b), in the second transmission sub-network of the transmission network, the switching node 1 on layer 1, the switching node 2 on layer 2 and the switching node 3 are deleted after the above evolution.
Specifically, the transmission network in fig. 6(b) is configured to transmit data in 8 source addresses to 4 destination addresses, and a second transmission sub-network in the transmission network is configured to transmit data in source addresses 4 to 7, where the second transmission sub-network includes 4 layers, i.e., layer 0, layer 1, layer 2, and layer 3, and the layer 0 includes 4 switching nodes, i.e., switching node 0, switching node 1, switching node 2, and switching node 3. Layer 1 includes 3 switching nodes, node 0, node 2, and node 3. Layer 2 includes 2 switching nodes, node 0 and node 1. Layer 3 includes 4 switching nodes. The connection mode of each switching node in each layer can refer to fig. 6(b), and is not described one by one here.
In the embodiment of the present application, the source address and the destination address satisfy the second preset relationship, so that when data in the second half of the source address is transmitted through the second transmission sub-network shown in fig. 6(b), no collision occurs, and meanwhile, compared with a conventional transmission network that does not collide, such as a Crossbar network, the number of switching nodes of the transmission network is significantly reduced, and the complexity of the transmission network is significantly reduced.
In this embodiment, a transport network for transmitting data between a source address and a destination address is proposed based on a second predetermined relationship satisfied between the source address and the destination address, in a second transport sub-network of the transport network, no switching node exists from a 2^ (S-1) +1 position to a 2^ S position of a layer S, and when at least one switching node exists from the 1 st position to the 2^ S position in the layer S, each of the at least one switching node does not include an uplink connection line. When data is transmitted through the transmission network, collision is not generated. Meanwhile, compared with the traditional network which does not collide, the number of the switching nodes of the transmission network is obviously reduced, and the complexity of the transmission network is obviously reduced. Therefore, the transmission network has the advantages of high transmission speed and less occupied transmission resources. When the transmission network is used for transmitting sparse data, the transmission overhead and the calculation overhead can be greatly reduced, and the processing efficiency of the sparse data is greatly improved.
In a specific implementation, the transmission network may use the structure as shown in fig. 5(b), i.e. only the first transmission sub-network uses the evolved network structure, or the transmission network may use the structure as shown in fig. 6(b), i.e. only the second transmission sub-network uses the evolved network structure. Alternatively, the transmission network may also use the structure shown in fig. 7, where fig. 7 is a schematic diagram of a network structure obtained by simultaneously using the two-part sub-network evolution method shown in the foregoing, and in fig. 7, the structure of the first transmission sub-network is the same as that in fig. 5(b), and the structure of the second transmission sub-network is the same as that in fig. 6(b), which is not described herein again.
The following table 1 is an example comparing the above-described fig. 5(b) and the above-described fig. 7 with a conventional transmission network. As shown in table 1, compared to the conventional Crossbar network, the number of 2 × 2 switching nodes and connection lines is greatly reduced in fig. 5(b) and fig. 7, and meanwhile, compared to the conventional butterfly network, the collision phenomenon can be avoided.
TABLE 1
Figure PCTCN2019099262-APPB-000001
A specific procedure when data transmission is performed in step S202 based on the above-described transmission network will be described below.
Fig. 8 is a schematic flow chart of a data transmission method according to an embodiment of the present application, and as shown in fig. 8, a process of transmitting data to be transmitted to a target address by using the first transmission sub-network includes:
S801, acquiring the target addresses corresponding to the data to be transmitted stored in the 1st source address to the N/2th source address, wherein each target address is represented by a binary value.
Optionally, the destination address of the data to be transmitted may be obtained according to a preset correspondence between the data number to be transmitted and the destination address. For example, assuming that 8 data are stored in 8 source addresses, including 2 data to be transmitted, the destination address of the first data to be transmitted is address 0, and the destination address of the second data to be transmitted is address 1.
S802, starting from a Least Significant Bit (LSB) of the target address, determining a transmission path of the data to be transmitted in the transmission network according to a value of each bit in the target address, and transmitting the data to be transmitted to the target address through the transmission path.
Take as an example the case in which the data sequence is a segment of the feature map in the neural network shown in fig. 1 and the transmission network is the transmission network shown in fig. 7. The feature map segment includes 2 valid data, and these 2 valid data are the data to be transmitted. The feature map segment is stored in the 8 source addresses shown in fig. 7, where data 5 is stored in source address 0, and so on in sequence. As can be seen from the foregoing description, the valid data in this feature map segment are 5 and 3, where data 3 is stored in source address 5. Thus, data 3 can be transmitted using the second transmission sub-network. Meanwhile, data 3 may be transmitted to target address 2 according to the second preset relationship. The binary value of the target address 1 is 001. Data 3 is routed on the second transmission sub-network starting from the LSB of 001. Specifically, since the LSB of 001 is 1, data 3 is routed from switching node 1 of layer 0 to switching node 1 of layer 2, is routed directly from switching node 1 of layer 2 to switching node 2 of layer 3, and is then transmitted to destination address 2.
In this embodiment, the transmission network is used to route the data to be transmitted to the destination address according to the LSB, so that the data transmission speed can be further increased.
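The order in which the bits of a binary target address are consumed in S802 can be illustrated as follows. This sketch only extracts the bits starting from the LSB; how each bit selects a connection line at a given switching node depends on the wiring shown in the figures and is not modeled here. The function name and the 3-bit width are assumptions.

```python
def lsb_first_bits(target, width):
    """Return the bits of target, least significant bit first.

    width is the number of bits in the binary representation that is used.
    """
    if target < 0 or target >= 2 ** width:
        raise ValueError("target does not fit in the given width")
    return [(target >> i) & 1 for i in range(width)]


# The 3-bit value 001 from the example yields the bit sequence 1, 0, 0.
print(lsb_first_bits(0b001, 3))
```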
Similar to the process shown in fig. 8, when the second transmission sub-network is used to transmit data, the target address corresponding to the data to be transmitted stored in the (N/2+1)th source address to the Nth source address may first be obtained, where the target address is represented by a binary value. Then, starting from the LSB of the target address, a transmission path of the data to be transmitted in the second transmission sub-network is determined according to the value of each bit in the target address, and the data to be transmitted is transmitted to the target address through the transmission path in the second transmission sub-network. The specific implementation process is the same as that of the first transmission subnetwork in fig. 8, and is not described herein again.
In the above embodiments, the number M of destination addresses is smaller than the number N of source addresses. Illustratively, M may be 4 and N may be 8. In this case, if the number of data to be transmitted stored in the source addresses is greater than M, not all of the data to be transmitted can be transmitted to the destination addresses at one time for processing. To address this problem, as an optional implementation, if the number of the data to be transmitted is greater than M, the data to be transmitted may be divided into a plurality of groups of sub-data, and each group of sub-data is transmitted to the corresponding destination addresses using the transmission network under one transmission clock.
Optionally, the data to be transmitted may be divided according to the source address. For example, if the number of the source addresses is 8 and the number of the destination addresses is 4, the data from source address 0 to source address 3 is used as the first group of sub-data, and the data from source address 4 to source address 7 is used as the second group of sub-data. Then, the data to be transmitted in the first group of sub-data is transmitted to the target addresses for operation through the transmission network under one clock, and the data to be transmitted in the second group of sub-data is transmitted to the target addresses for operation through the transmission network under another clock.
In this embodiment, when the number of the data to be transmitted is greater than the number of the target addresses, the data to be transmitted is divided into a plurality of groups of subdata, and each group of subdata is transmitted under different clocks, so that collision between data transmission and operation is avoided, and correctness of data transmission and operation is ensured.
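The division by source address described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `group_by_source_address`, the `is_valid` predicate, and the rule that non-zero values are the data to be transmitted are not taken from the embodiment.

```python
def group_by_source_address(source_values, m, is_valid=lambda v: v != 0):
    """Split the source addresses into consecutive windows of m addresses.

    Each window yields one group of (source_address, value) pairs. A window of
    m source addresses holds at most m data to be transmitted, so each group
    fits into m target addresses and can be sent under one transmission clock.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    groups = []
    for start in range(0, len(source_values), m):
        window = range(start, min(start + m, len(source_values)))
        groups.append(
            [(a, source_values[a]) for a in window if is_valid(source_values[a])]
        )
    return groups


# N = 8 source addresses, M = 4 target addresses, 5 data to be transmitted.
values = [5, 1, 0, 7, 2, 3, 0, 0]
for clock, group in enumerate(group_by_source_address(values, 4)):
    print(clock, group)
```

With these illustrative values, source addresses 0 to 3 form the group sent under the first clock and source addresses 4 to 7 form the group sent under the second clock.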
Fig. 9 is a block structure diagram of a data transmission apparatus according to an embodiment of the present application, where the apparatus may be an electronic device described in the foregoing embodiment, or an apparatus in the electronic device, which can implement the functions in the method according to the embodiment of the present application, for example, the apparatus may be an apparatus in the electronic device or a chip system. As shown in fig. 9, the apparatus includes:
a storage unit 901, a target module 902, a transmission network 903, and a control module 904.
The storage unit 901 has N source addresses, and the target module 902 has a plurality of destination addresses.
The transmission network 903 is connected to the storage unit 901 and the target module 902, respectively.
The transmission network 903 comprises a first transmission subnetwork comprising a plurality of layers, each layer comprising at least one switching node, no switching node being present from the 2^ (Y-1) +1 position to the 2^ Y position of layer Y, and each of the at least one switching node not comprising an uplink line when at least one switching node is present from the 1 st position to the 2^ Y position in layer Y.
The control module 904 may be connected to the storage unit 901, the target module 902, and the transmission network 903, respectively.
The control module 904 is configured to obtain at least one to-be-transmitted data from the storage unit 901, where the to-be-transmitted data is stored in the N source addresses in a scattered manner, and transmit the to-be-transmitted data stored in the 1 st to N/2 nd source addresses to corresponding destination addresses by using a first transmission subnetwork based on a first preset relationship between the source addresses and the destination addresses, where the first preset relationship includes: when the source address is K, the corresponding destination address is one of 0 to K starting from 0.
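The first preset relationship can be expressed as a simple range check. The following is a sketch only; the function names are hypothetical, and source addresses are assumed to be numbered from 0.

```python
def first_relation_targets(k):
    """Candidate destination addresses for source address k: 0, 1, ..., k."""
    if k < 0:
        raise ValueError("source address must be non-negative")
    return list(range(0, k + 1))


def satisfies_first_relation(k, target):
    """True if `target` is an allowed destination for source address k."""
    return 0 <= target <= k


print(first_relation_targets(2))       # [0, 1, 2]
print(satisfies_first_relation(1, 2))  # False
```

Under this relationship a datum is never moved to an address higher than its source address, which is what happens when scattered data are packed toward address 0.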
In an alternative embodiment, the transmission network 903 further comprises a second transmission subnetwork.
The second transmission subnetwork comprises a plurality of layers, each layer comprising at least one switching node, no switching node being present from the 2^ (S-1) +1 position to the 2^ S position of layer S, and each of the at least one switching node not comprising an uplink connection line when at least one switching node is present from the 1 st position to the 2^ S position in layer S;
the control module 904 is further configured to transmit the data to be transmitted, stored in the (N/2+1)th source address to the Nth source address, to the corresponding destination address using the second transmission subnetwork based on a second preset relationship between the source address and the destination address, where the second preset relationship includes: when the source address is L, the corresponding destination address is one of M-1 to M-1- [ L% (N/2) ] starting from M-1, M is the number of destination addresses, and M is smaller than N.
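The second preset relationship can be sketched in the same way. The sketch assumes that source addresses are numbered from 0, so that the upper half covers addresses N/2 to N-1, and that "%" denotes the remainder operator; the function name `second_relation_targets` is hypothetical.

```python
def second_relation_targets(l, n, m):
    """Candidate destination addresses for source address l in the upper half.

    Follows the stated range M-1 down to M-1-[L % (N/2)], reading % as the
    remainder operator (an assumption of this sketch).
    """
    half = n // 2
    if not (half <= l < n):
        raise ValueError("l must lie in the upper half of the source addresses")
    lowest = m - 1 - (l % half)
    return list(range(m - 1, lowest - 1, -1))


# N = 8 source addresses, M = 4 destination addresses.
for source in range(4, 8):
    print(source, second_relation_targets(source, 8, 4))
```

For N = 8 and M = 4, source address 4 may only go to address 3, while source address 7 may go to any of addresses 3, 2, 1, or 0, so data in the upper half are packed downward from address M-1.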
In an alternative embodiment, the number of layers of the first transmission subnetwork is log2(N)+1, and/or the number of layers of the second transmission subnetwork is log2(N)+1.
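As a worked example of the layer count, the following sketch evaluates log2(N)+1. It assumes that N is a power of two, which the embodiment does not state explicitly; the function name `layer_count` is hypothetical.

```python
def layer_count(n):
    """Number of layers log2(N) + 1, assuming N is a power of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError("n is assumed to be a power of two")
    # For a power of two, bit_length() - 1 equals log2(n) exactly.
    return (n.bit_length() - 1) + 1


print(layer_count(8))  # 4
```

For N = 8 this gives 4 layers, which would correspond to layers 0 to 3 if the layers are numbered from 0.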
In an alternative embodiment, the control module 904 is specifically configured to:
acquiring a target address corresponding to the data to be transmitted, which is stored in the 1 st source address to the N/2 nd source address, wherein the target address is represented by a binary value; and determining a transmission path of the data to be transmitted in the first transmission sub-network according to the value of each bit in the target address from the LSB of the target address, and transmitting the data to be transmitted to the target address through the transmission path in the first transmission sub-network.
In an alternative embodiment, the control module 904 is specifically configured to:
acquiring a target address corresponding to data to be transmitted, which are stored in the (N/2 + 1) th source address to the Nth source address, wherein the target address is represented by binary number values; and determining a transmission path of the data to be transmitted in the second transmission sub-network according to the value of each bit in the target address from the LSB of the target address, and transmitting the data to be transmitted to the target address through the transmission path in the second transmission sub-network.
As an alternative implementation, the target module 902 may be a calculation module, and the calculation module includes at least M addresses.
When the number of the data to be transmitted is greater than M, the control module 904 is further configured to divide at least one data to be transmitted into a plurality of groups of subdata, where each group of subdata is transmitted under one transmission clock.
The data transmission device provided in the embodiment of the present application may perform the method steps in the foregoing method embodiments, and the implementation principle and the technical effect are similar, which are not described herein again.
It should be noted that the division of the modules of the above apparatus is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity, or may be physically separated. And these modules can be realized in the form of software called by processing element; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the determining module may be a processing element separately set up, or may be implemented by being integrated in a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and the function of the determining module is called and executed by a processing element of the apparatus. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more digital signal processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). For another example, when some of the above modules are implemented in the form of a processing element invoking program code, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor that can call program code. As another example, these modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the implementation may be wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), etc.
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 10, the electronic device 1000 may include: a processor 101 (e.g., CPU), memory 102, transceiver 103; the transceiver 103 is coupled to the processor 101, and the processor 101 controls the transceiving action of the transceiver 103. Various instructions may be stored in memory 102 for performing various processing functions and implementing the method steps performed by the electronic device in the embodiments of the present application. Optionally, the electronic device related to the embodiment of the present application may further include: a power supply 104, a system bus 105, and a communication port 106. The transceiver 103 may be integrated in a transceiver of the electronic device or may be a separate transceiving antenna on the electronic device. The system bus 105 is used to implement communication connections between the elements. The communication port 106 is used for realizing connection and communication between the electronic device and other peripherals.
In the embodiment of the present application, the processor 101 is configured to be coupled with the memory 102, and read and execute the instructions in the memory 102 to implement the method steps performed by the electronic device in the above method embodiment. The implementation principle and the technical effect are similar, and are not described in detail herein.
The system bus mentioned in fig. 10 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The system bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus. The communication interface is used for realizing communication between the database access device and other equipment (such as a client, a read-write library and a read-only library). The memory may comprise Random Access Memory (RAM) and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The processor may be a general-purpose processor, including a central processing unit (CPU), a Network Processor (NP), and the like; it may also be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components.
Optionally, an embodiment of the present application further provides a computer-readable storage medium, where instructions are stored in the storage medium, and when the instructions are executed on a computer, the instructions cause the computer to perform the processing procedure of the electronic device in the foregoing embodiment.
Optionally, an embodiment of the present application further provides a chip for executing the instruction, where the chip is used to execute a processing procedure of the electronic device in the foregoing embodiment.
The embodiment of the present application further provides a program product, where the program product includes a computer program, the computer program is stored in a storage medium, the computer program can be read from the storage medium by at least one processor, and the at least one processor executes a processing procedure of the electronic device in the embodiment.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., A and/or B, which may mean: A exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship; in a formula, the character "/" indicates that the preceding and following related objects are in a relationship of "division". "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple.
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for convenience of description and distinction and are not intended to limit the scope of the embodiments of the present application.
It should be understood that, in the embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence; the execution sequence of the processes should be determined by their functions and inherent logic, and the sequence numbers should not constitute any limitation to the implementation process of the embodiments of the present application.

Claims (17)

  1. A method of data transmission, comprising:
    acquiring at least one data to be transmitted from a storage unit, wherein the storage unit is provided with N source addresses, and the data to be transmitted is dispersedly stored in the N source addresses;
    based on a first preset relationship between source addresses and target addresses, using a first transmission sub-network to transmit data to be transmitted, which is stored in the 1 st source address to the N/2 nd source address, to the corresponding target addresses, wherein the first preset relationship comprises: when the source address is K, the corresponding target address is one of 0 to K starting from 0;
    wherein the first transmission sub-network comprises a plurality of layers, each layer comprising at least one switching node, no switching node being present from a 2^ (Y-1) +1 position to a 2^ Y position of layer Y, and each of the at least one switching node not comprising an uplink line when at least one switching node is present from the 1 st position to the 2^ Y position in layer Y.
  2. The method of claim 1, further comprising:
    based on a second preset relationship between the source address and the destination address, using a second transmission sub-network to transmit the data to be transmitted, which is stored in the (N/2 + 1) th source address to the Nth source address, to the corresponding destination address, wherein the second preset relationship includes: when the source address is L, the corresponding target address is one of M-1 to M-1- [ L% (N/2) ] starting from M-1, M is the number of the target addresses, and M is less than N;
    the second transmission sub-network comprises a plurality of layers, each layer comprising at least one switching node, no switching node being present from the 2^ (S-1) +1 position to the 2^ S position of layer S, and each of the at least one switching node not comprising an uplink connection line when at least one switching node is present from the 1 st position to the 2^ S position in layer S.
  3. The method according to claim 2, characterized in that the number of layers of the first transmission subnetwork is log2(N)+1, and/or the number of layers of the second transmission subnetwork is log2(N)+1.
  4. The method according to any of claims 1-3, wherein the transmitting data to be transmitted stored in the 1 st source address to the N/2 nd source address to the corresponding destination address using the first transmission sub-network comprises:
    acquiring a target address corresponding to the transmission of the data to be transmitted stored in the 1 st source address to the N/2 nd source address, wherein the target address is represented by a binary number value;
    and determining a transmission path of the data to be transmitted in the first transmission sub-network from the Least Significant Bit (LSB) of the target address according to the numerical value of each bit in the target address, and transmitting the data to be transmitted to the target address through the transmission path in the first transmission sub-network.
  5. The method according to claim 2 or 3, wherein the transmitting the data to be transmitted stored in the (N/2 + 1) th source address to the Nth source address to the corresponding destination address by using the second transmission sub-network comprises:
    acquiring a target address corresponding to the data to be transmitted, which is stored in the (N/2 + 1) th source address to the Nth source address, wherein the target address is represented by binary number values;
    and determining a transmission path of the data to be transmitted in the second transmission sub-network according to the numerical value of each bit in the target address from the LSB of the target address, and transmitting the data to be transmitted to the target address through the transmission path in the second transmission sub-network.
  6. The method according to any of claims 1-5, wherein the target address is an address in a computation module, the computation module comprising at least M addresses.
  7. The method of claim 6, wherein the transmitting the data to be transmitted stored in the 1 st source address to the N/2 th source address to the corresponding destination address using the first transmission sub-network based on the first predetermined relationship between the source address and the destination address further comprises:
    and if the quantity of the data to be transmitted is greater than M, dividing the at least one data to be transmitted into a plurality of groups of subdata, and transmitting each group of subdata under one transmission clock.
  8. A data transmission apparatus, comprising: a storage unit, a target module, a transmission network, and a control module;
    N source addresses are set in the storage unit;
    a plurality of target addresses are set in the target module;
    the transmission network is respectively connected with the storage unit and the target module;
    the transmission network comprises a first transmission subnetwork comprising a plurality of layers, each layer comprising at least one switching node, no switching node being present from a 2^ (Y-1) +1 position to a 2^ Y position of layer Y, and each of the at least one switching node not comprising an uplink line when at least one switching node is present from a 1 st position to a 2^ Y position in layer Y;
    the control module is configured to obtain at least one to-be-transmitted data from the storage unit, where the to-be-transmitted data is stored in the N source addresses in a dispersed manner, and transmit the to-be-transmitted data stored in the 1 st to N/2 nd source addresses to corresponding destination addresses using the first transmission subnetwork based on a first preset relationship between the source addresses and the destination addresses, where the first preset relationship includes: when the source address is K, the corresponding destination address is one of 0 to K starting from 0.
  9. The apparatus of claim 8, wherein the transmission network further comprises a second transmission subnetwork;
    the second transmission subnetwork comprises a plurality of layers, each layer comprising at least one switching node, no switching node being present from the 2^ (S-1) +1 position to the 2^ S position of layer S, and each of the at least one switching node not comprising an uplink line when at least one switching node is present from the 1 st position to the 2^ S position in layer S;
    the control module is further configured to transmit data to be transmitted, which is stored in the (N/2 + 1) th source address to the Nth source address, to a corresponding destination address using the second transmission subnetwork based on a second preset relationship between the source address and the destination address, where the second preset relationship includes: when the source address is L, the corresponding destination address is one of M-1 to M-1- [ L% (N/2) ] starting from M-1, M is the number of destination addresses, and M is smaller than N.
  10. The apparatus of claim 9, wherein the number of layers of the first transmission subnetwork is log2(N)+1, and/or the number of layers of the second transmission subnetwork is log2(N)+1.
  11. The apparatus according to any one of claims 8-10, wherein the control module is specifically configured to:
    acquiring a target address corresponding to the data to be transmitted, which is stored in the 1 st source address to the N/2 nd source address, wherein the target address is represented by a binary number value; and
    and determining a transmission path of the data to be transmitted in the first transmission sub-network from the Least Significant Bit (LSB) of the target address according to the numerical value of each bit in the target address, and transmitting the data to be transmitted to the target address through the transmission path in the first transmission sub-network.
  12. The apparatus according to claim 9 or 10, wherein the control module is specifically configured to:
    acquiring a target address corresponding to the data to be transmitted, which is stored in the (N/2 + 1) th source address to the Nth source address, wherein the target address is represented by binary number values; and
    and determining a transmission path of the data to be transmitted in the second transmission sub-network according to the numerical value of each bit in the target address from the LSB of the target address, and transmitting the data to be transmitted to the target address through the transmission path in the second transmission sub-network.
  13. The apparatus according to any one of claims 8-12, wherein the target module is a calculation module, and wherein the calculation module comprises at least M addresses.
  14. The apparatus of claim 13, wherein the control module is further configured to:
    and when the number of the data to be transmitted is more than M, dividing the at least one data to be transmitted into a plurality of groups of subdata, and transmitting each group of subdata under one transmission clock.
  15. An electronic device, comprising: a memory and a processor;
    the processor is coupled to the memory, and reads and executes instructions in the memory to implement the method steps of any one of claims 1-7.
  16. A computer program product, characterized in that the computer program product comprises computer program code which, when executed by a computer, causes the computer to perform the method of any of claims 1-7.
  17. A computer-readable storage medium having stored thereon computer instructions which, when executed by a computer, cause the computer to perform the method of any of claims 1-7.
CN201980098672.1A 2019-08-05 2019-08-05 Data transmission method and device, electronic equipment and readable storage medium Pending CN114144793A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/099262 WO2021022441A1 (en) 2019-08-05 2019-08-05 Data transmission method and device, electronic device and readable storage medium

Publications (1)

Publication Number Publication Date
CN114144793A true CN114144793A (en) 2022-03-04

Family

ID=74502548

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980098672.1A Pending CN114144793A (en) 2019-08-05 2019-08-05 Data transmission method and device, electronic equipment and readable storage medium

Country Status (2)

Country Link
CN (1) CN114144793A (en)
WO (1) WO2021022441A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3358456A4 (en) * 2016-12-05 2018-08-08 Huawei Technologies Co., Ltd. Control method, storage device and system for data read/write command in nvme over fabric architecture
CN109214543B (en) * 2017-06-30 2021-03-30 华为技术有限公司 Data processing method and device
CN107608715B (en) * 2017-07-20 2020-07-03 上海寒武纪信息科技有限公司 Apparatus and method for performing artificial neural network forward operations
US11468332B2 (en) * 2017-11-13 2022-10-11 Raytheon Company Deep neural network processor with interleaved backpropagation
CN109165728B (en) * 2018-08-06 2020-12-18 浪潮集团有限公司 Basic computing unit and computing method of convolutional neural network

Also Published As

Publication number Publication date
WO2021022441A1 (en) 2021-02-11

Similar Documents

Publication Publication Date Title
CN110210610B (en) Convolution calculation accelerator, convolution calculation method and convolution calculation device
TW202147188A (en) Method of training neural network model and related product
CN111475250B (en) Network optimization method and device in cloud environment
CN112800386B (en) Fourier transform processing method, processor, terminal, chip and storage medium
CN105740405A (en) Data storage method and device
CN109729731B (en) Accelerated processing method and device
CN111046004B (en) Data file storage method, device, equipment and storage medium
CN114144793A (en) Data transmission method and device, electronic equipment and readable storage medium
KR102238600B1 (en) Scheduler computing device, data node of distributed computing system having the same, and method thereof
CN108446177B (en) Task processing method, computer readable storage medium and terminal device
CN111027688A (en) Neural network calculator generation method and device based on FPGA
CN113254072B (en) Data processor, data processing method, chip, computer device, and medium
US10193757B2 (en) Network topology system and method
CN115346099A (en) Image convolution method, chip, equipment and medium based on accelerator chip
CN110543664B (en) Process mapping method for FPGA with special structure
US20200057638A1 (en) Linear feedback shift register for a reconfigurable logic unit
CN112073505A (en) Method for unloading on cloud server, control device and storage medium
CN114095289B (en) Data multicast circuit, method, electronic device, and computer-readable storage medium
WO2023173912A1 (en) Configuration method for processing element (pe) array and related device
CN113570049B (en) Relative addressing method, device, equipment and medium for interconnection of multiple SNN chips
CN114827016B (en) Method, device, equipment and storage medium for switching link aggregation scheme
CN114186186B (en) Matrix calculation method and related equipment
CN115955429B (en) Network-on-chip routing method, device, system and electronic equipment
CN114006813B (en) Dynamic generation method and system for virtual private line distribution route
CN112015610B (en) Test sequence generation method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination