CN113051212B - Graphics processor, data transmission method, data transmission device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN113051212B
CN113051212B CN202110229587.3A CN202110229587A CN113051212B CN 113051212 B CN113051212 B CN 113051212B CN 202110229587 A CN202110229587 A CN 202110229587A CN 113051212 B CN113051212 B CN 113051212B
Authority
CN
China
Prior art keywords
data
data packet
packet
graphics processor
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110229587.3A
Other languages
Chinese (zh)
Other versions
CN113051212A (en)
Inventor
龙斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha Jingmei Integrated Circuit Design Co ltd
Changsha Jingjia Microelectronics Co ltd
Original Assignee
Changsha Jingmei Integrated Circuit Design Co ltd
Changsha Jingjia Microelectronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha Jingmei Integrated Circuit Design Co ltd and Changsha Jingjia Microelectronics Co ltd
Priority to CN202110229587.3A
Publication of CN113051212A
Application granted
Publication of CN113051212B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163 Interprocessor communication
    • G06F15/17 Interprocessor communication using an input/output type connection, e.g. channel, I/O port
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

An embodiment of the present application provides a graphics processor and a data transmission method. The graphics processor comprises: a connection module, configured to acquire first data to be sent from an on-chip interconnect bus and generate a first data packet, or to parse a received second data packet to obtain second data and send the second data to the on-chip interconnect bus; a sending switching module, configured to forward the generated first data packet to a transmitter; the transmitter, configured to transmit the generated first data packet to a receiver of another graphics processor; a receiver, configured to receive a second data packet transmitted by a transmitter of another graphics processor; and a receiving switching module, configured to forward the second data packet received by the receiver to the connection module. Because the connection module is built into the GPU, GPUs can be interconnected directly through their receivers and transmitters by way of the connection module, so no additional interface is needed and the difficulty of interconnecting GPUs is effectively reduced.

Description

Graphics processor, data transmission method, data transmission device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer image processing technology, and in particular, to a graphics processor, a data transmission method, a data transmission device, an electronic apparatus, and a storage medium.
Background
A graphics processing unit (GPU) is a microprocessor dedicated to image and graphics operations on personal computers, workstations, game consoles, and some mobile devices. When GPUs are used for image processing, multiple GPUs typically cooperate, each performing part of the image processing task.
Cooperation among GPUs requires data communication between them, which is generally implemented with a dedicated high-speed serial interconnect bus. However, such a bus requires an additional interface, which makes interconnecting GPUs difficult.
Disclosure of Invention
The embodiment of the application provides a graphics processor and a data transmission method, which can effectively solve the problem of difficult interconnection between GPUs.
According to a first aspect of an embodiment of the present application, there is provided a graphics processor including: a connection module, a sending switching module, a transmitter, a receiving switching module, and a receiver. The connection module is configured to acquire first data to be sent from the on-chip interconnect bus and generate a first data packet, or to parse a received second data packet to obtain second data and send the second data to the on-chip interconnect bus; the sending switching module is configured to forward the generated first data packet to the transmitter; the transmitter is configured to transmit the generated first data packet to a receiver of another graphics processor; the receiver is configured to receive a second data packet sent by the transmitter of another graphics processor; and the receiving switching module is configured to forward the second data packet received by the receiver to the connection module.
According to a second aspect of the embodiment of the present application, there is provided a data transmission method applied to the graphics processor provided in the first aspect, the method including: acquiring first data, wherein the first data is data which needs to be sent to another graphic processor; packaging the first data to obtain a first data packet; and transmitting the first data packet to a receiver of another connected graphics processor through a transmitter.
According to a third aspect of an embodiment of the present application, there is provided a data transmission apparatus applied to the graphics processor provided in the first aspect, the apparatus including: the receiving module is used for acquiring first data, wherein the first data is data which needs to be sent to another graphic processor; the processing module is used for packaging the first data to obtain a first data packet; and the transmission module is used for transmitting the first data packet to a receiver of another connected graphics processor through a transmitter.
According to a fourth aspect of embodiments of the present application, there is provided an electronic device comprising at least two of the graphics processors provided in the first aspect, the graphics processors being configured to perform the data transmission method provided in the second aspect.
According to a fifth aspect of embodiments of the present application, an embodiment of the present application provides a computer readable storage medium having program code stored therein, wherein the above-described method is performed when the program code is run.
The graphic processor provided by the embodiment of the application comprises a connection module, a sending switching module, a sender, a receiving switching module and a receiver; the connection module is used for acquiring first data to be sent from the on-chip interconnection bus and generating a first data packet, or is used for analyzing a received second data packet to obtain second data and then sending the second data to the on-chip interconnection bus; the sending switching module is used for forwarding the generated first data packet to the sender; the transmitter is configured to transmit the generated first data packet to a receiver of another graphics processor; the receiver is used for receiving a second data packet sent by the sender of another graphic processor; the receiving switching module is configured to forward the second data packet received by the receiver to the connection module. The connecting module is designed in the GPU, and the GPU can be directly interconnected through the receiver and the transmitter through the connecting module, so that an additional interface is not needed, and the difficulty of interconnection between the GPUs can be effectively reduced.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic diagram of an implementation of graphics processor interconnect provided by an embodiment of the present application;
FIG. 2 is a block diagram of a graphics processor according to one embodiment of the present application;
FIG. 3 is a schematic diagram of a connection module according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an organization of a data packet according to an embodiment of the present application;
FIG. 5 is a flowchart of a data transmission method according to an embodiment of the present application;
FIG. 6 is a flowchart of a data transmission method according to another embodiment of the present application;
FIG. 7 is a functional block diagram of a data transmission device according to an embodiment of the present application;
FIG. 8 is a block diagram of an electronic device for performing a data transmission method according to an embodiment of the present application.
Detailed Description
A graphics processing unit (GPU) is a microprocessor dedicated to image and graphics operations on personal computers, workstations, game consoles, and some mobile devices. When GPUs are used for image processing, multiple GPUs typically cooperate, each performing part of the image processing task.
Cooperation among GPUs requires data communication between them, which is generally implemented with a dedicated high-speed serial interconnect bus. However, such a bus requires an additional interface, which makes interconnecting GPUs difficult.
The inventors found in their research that many GPU designs include both an hdmi_tx (transmitter) and an hdmi_rx (receiver). In some board-level applications where hdmi_tx and hdmi_rx are not used at the same time, the hdmi_tx and hdmi_rx interfaces of multiple GPUs may be connected to one another, interconnecting the GPUs and forming a dedicated data channel.
Referring to FIG. 1, a schematic diagram of implementing GPU interconnection according to an embodiment of the present application is shown.
FIG. 1 shows two GPUs, GPU (A) and GPU (B). Both are connected to pcie_switch through a PCIE bus, and pcie_switch is connected through a PCIE bus to the host device, i.e., the central processing unit (CPU). Under the control of the host, GPU (A) may send data through its hdmi_tx to the hdmi_rx interface of GPU (B), which receives the data; at the same time, GPU (B) may send data through its hdmi_tx to the hdmi_rx interface of GPU (A).
In this data mode, hdmi_tx and hdmi_rx are the main data transmission lines, and trans_lanex3 denotes the 3 sets of transmission lines corresponding to the RGB chrominance components. The GPIO connection line carries a data flow control signal (the receiving terminal in FIG. 1): the receiving device returns it to the transmitting device as an enable signal for the current data transmission. The inventors found that a connection module can be provided inside each GPU and used to interconnect GPUs directly, so that GPUs can be interconnected without an additional interface.
Therefore, in an embodiment of the present application, there is provided a graphics processor, where the graphics processor includes a connection module, a transmission switching module, a transmitter, a reception switching module, and a receiver; the connection module is used for acquiring first data to be sent from the on-chip interconnection bus and generating a first data packet, or is used for analyzing a received second data packet to obtain second data and then sending the second data to the on-chip interconnection bus; the sending switching module is used for forwarding the generated first data packet to the sender; the transmitter is configured to transmit the generated first data packet to a receiver of another graphics processor; the receiver is used for receiving a second data packet sent by the sender of another graphic processor; the receiving switching module is configured to forward the second data packet received by the receiver to the connection module. The connecting module is designed in the GPU, and the GPU can be directly interconnected through the receiver and the transmitter through the connecting module, so that an additional interface is not needed, and the difficulty of interconnection between the GPUs can be effectively reduced.
In order to make the technical solutions and advantages of the embodiments of the present application more apparent, the following detailed description of exemplary embodiments of the present application is provided in conjunction with the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application and not exhaustive of all embodiments. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
Referring to FIG. 2, a block diagram of a graphics processor according to one embodiment of the present application is shown. The graphics processor (GPU) 100 includes a connection module 10, a sending switching module 20, a transmitter 30, a receiving switching module 40, and a receiver 50. The connection module 10 is configured to generate a data packet or parse a received data packet; the sending switching module 20 is configured to forward the generated data packet to the transmitter 30; the receiving switching module 40 is configured to forward the data packet received by the receiver 50 to the connection module 10.
It can be understood that once GPUs are connected to each other, the system can complete complex drawing and computation tasks, and each GPU may act as a data transmitting end or a data receiving end.
When the GPU 100 acts as the transmitting end, the connection module 10 may acquire the data to be transmitted, i.e., the first data, through an on-chip interconnect bus. The on-chip interconnect bus may be, for example, an Advanced eXtensible Interface (AXI) interconnect bus, an Advanced High-performance Bus (AHB), an Advanced Peripheral Bus (APB), or another custom bus. It may be chosen according to actual needs and is not limited here; the embodiments of the present application are described in detail taking an AXI interconnect bus as an example. That is, the connection module 10 may obtain the data to be transmitted from the AXI interconnect bus and then package the data to be sent to another graphics processor; in other words, it acquires the first data and generates the first data packet from it. The connection module 10 sends the generated first data packet to the sending switching module 20, which forwards it to the transmitter 30, so that the transmitter 30 can send the first data packet to the receiver of another GPU.
The data packet format is set in advance, and the connection module 10 may package the first data according to this preset format to obtain a first data packet in the preset format. The preset data packet format comprises a packet header field, a packet payload field, and a packet trailer field. That is, the connection module 10 may encapsulate the first data into a packet header field, a packet payload field, and a packet trailer field to obtain the first data packet.
When the GPU 100 acts as the receiving end, the receiver 50 may receive a data packet sent by another GPU; this packet is defined as the second data packet. After receiving the second data packet, the receiver 50 may send it to the receiving switching module 40, which forwards it to the connection module 10. The connection module 10 may parse the second data packet to obtain the actual transmitted data, i.e., the second data, and the destination address, and then send the second data to the destination address through the on-chip interconnect bus.
When the connection module 10 generates the first data packet, the first data may be packaged according to a preset data packet format, and the obtained first data packet includes information such as a destination address, a data length, and load data. It may be appreciated that the format of the second data packet received by the connection module 10 is also the preset data packet format, so that the connection module 10 may parse the second data packet to obtain a destination address, so that the second data parsed from the second data packet may be sent to a location corresponding to the destination address through an on-chip interconnection bus.
Referring to FIG. 3, a schematic diagram of a connection module according to an embodiment of the present application is shown. The connection module 10 includes an AXI interface 11, a transmit data buffer unit 12, a transmit data encoding and packaging unit 13, a timing generation unit 14, a receive data buffer unit 15, and a receive data decoding and unpacking unit 16. The AXI interface 11 is connected to the transmit data buffer unit 12 and the receive data buffer unit 15; the transmit data buffer unit 12 is connected to the transmit data encoding and packaging unit 13; the receive data buffer unit 15 is connected to the receive data decoding and unpacking unit 16; and the timing generation unit 14 is connected to the transmit data encoding and packaging unit 13. Here the on-chip interconnect bus is an AXI interconnect bus.
The AXI interface 11 obtains data to be sent to another GPU, i.e. first data to be sent, from the AXI interconnection bus. When the GPU is used as the transmitting end, the AXI interface is also used as the transmitting end, and may acquire a data start address from an AXI interconnection bus, and automatically initiate AXI data reading access according to the data start address, so as to acquire the first data. The first data acquired by the AXI interface 11 is 128-bit data. After the AXI interface 11 acquires the first data, the first data may be temporarily stored in the transmit data buffer unit 12. The transmission data buffer unit 12 may send the first data to the transmission data encoding and packaging unit 13, where the first data acquired by the transmission data encoding and packaging unit 13 is 128-bit data. The transmission data encoding and packaging unit 13 may package the first data in 128-bit transmission units.
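The read-and-buffer step described above can be sketched in software as follows. This is only an illustrative model, not the hardware design: the function name `read_beats`, the use of a Python `deque` as the transmit data buffer, the flat `bytes` memory, and the little-endian byte order are all assumptions made for the example.

```python
from collections import deque

BEAT_BYTES = 16  # one 128-bit transfer unit, as stated in the text

def read_beats(memory: bytes, start: int, count: int) -> deque:
    """Read `count` 128-bit units beginning at `start` into a FIFO.

    `memory` stands in for whatever the AXI read access returns;
    the little-endian interpretation is an assumption of this sketch.
    """
    fifo = deque()
    for i in range(count):
        offset = start + i * BEAT_BYTES
        chunk = memory[offset:offset + BEAT_BYTES]
        if len(chunk) != BEAT_BYTES:
            raise ValueError("read runs past the end of memory")
        fifo.append(int.from_bytes(chunk, "little"))
    return fifo
```

Each FIFO entry is then one 128-bit unit ready to be handed to the packaging stage in arrival order.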
The transmit data encoding and packaging unit 13 may perform packaging according to a predetermined data packet format when packaging the first data. The preset data packet format comprises a packet header field, a packet load field and a packet tail field.
Because the receiver and the transmitter are audio-video transmission interfaces, they typically have 3 sets of transmission lines, corresponding to the RGB chrominance components. Each set of transmission lines typically carries one byte, i.e., 8 bits, per cycle, so the receiver and transmitter can transfer 24 bits of data per cycle. To implement the preset data packet format, the acquired 128-bit first data may be packaged in batches of 24-bit units, in which the two highest bits of each 24-bit unit serve as the packet field identifier. Specifically, the identifier is 00 for the packet header field, 01 for the packet payload data field, 10 for the packet trailer field, and 11 for the invalid field. FIG. 4 shows the organization of a data packet.
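As a rough illustration of the 24-bit unit described above, the following sketch builds and splits such units. The identifier values come from the text; the function names, the placement of the remaining 22 bits as a single "body", and the order in which the three bytes map onto the three sets of transmission lines are assumptions.

```python
from typing import Tuple

# Identifier values from the text: two highest bits of each 24-bit unit.
HEAD, PAYLOAD, TAIL, INVALID = 0b00, 0b01, 0b10, 0b11
BODY_BITS = 22                      # 24 bits minus the 2-bit identifier
BODY_MASK = (1 << BODY_BITS) - 1

def make_word(tag: int, body: int) -> int:
    """Build one 24-bit unit: identifier in bits [23:22], body in [21:0]."""
    if not 0 <= tag <= 3:
        raise ValueError("identifier must fit in 2 bits")
    if not 0 <= body <= BODY_MASK:
        raise ValueError("body must fit in 22 bits")
    return (tag << BODY_BITS) | body

def split_word(word: int) -> Tuple[int, int]:
    """Return (identifier, body) of a 24-bit unit."""
    return (word >> BODY_BITS) & 0b11, word & BODY_MASK

def to_lanes(word: int) -> Tuple[int, int, int]:
    """Split a 24-bit unit into three bytes, one per set of lines.

    Which byte travels on which line set is an assumption here.
    """
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF
```

With a 2-bit identifier, each unit has 22 bits left for content, which is why a 128-bit data item needs six units (6 x 22 = 132 bits).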
The packet header field, i.e., Head in FIG. 4, is organized as three 24-bit units and includes the destination address Add[35:0], the packet length burst[7:0], and the start data byte mask FirstBE[15:0]. FirstBE is the byte mask of the first data in the packet transfer and can be understood as the starting byte address.
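The header fields named above occupy 36 + 8 + 16 = 60 bits, which fits in the 66 content bits of three 24-bit units (3 x 22). The sketch below shows one possible packing. The exact bit positions are defined by FIG. 4, which is not reproduced here, so the field order, the zeroed reserved bits, and the names `pack_header` and `unpack_header` are assumptions.

```python
from typing import List, Tuple

HEAD = 0b00                          # header identifier from the text
BODY_BITS = 22
BODY_MASK = (1 << BODY_BITS) - 1

def pack_header(addr: int, burst: int, first_be: int) -> List[int]:
    """Pack Add[35:0], burst[7:0], FirstBE[15:0] into three 24-bit units.

    Assumed layout: address in the high bits, then burst, then FirstBE;
    the 6 unused content bits stay zero (reserved).
    """
    if addr < 0 or addr >> 36 or burst < 0 or burst >> 8:
        raise ValueError("address or burst out of range")
    if first_be < 0 or first_be >> 16:
        raise ValueError("FirstBE out of range")
    fields = (addr << 24) | (burst << 16) | first_be
    return [(HEAD << BODY_BITS) | ((fields >> shift) & BODY_MASK)
            for shift in (44, 22, 0)]

def unpack_header(words: List[int]) -> Tuple[int, int, int]:
    """Inverse of pack_header: return (addr, burst, first_be)."""
    fields = 0
    for word in words:
        fields = (fields << BODY_BITS) | (word & BODY_MASK)
    return (fields >> 24) & ((1 << 36) - 1), (fields >> 16) & 0xFF, fields & 0xFFFF
```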
The packet payload data field, i.e., PayLoad in FIG. 4, consists of a number of 128-bit data items; the number of items corresponds to the packet length burst in the packet header field, and each 128-bit item is organized and transmitted as 6 consecutive 24-bit units.
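A minimal sketch of splitting one 128-bit item into six payload units follows. Six units offer 132 content bits, so 4 bits are left over; placing them (as zeros) at the top of the first unit, and sending the most significant bits first, are assumptions of this example, as are the function names.

```python
from typing import List

PAYLOAD = 0b01                       # payload identifier from the text
BODY_BITS = 22
BODY_MASK = (1 << BODY_BITS) - 1

def pack_item(data: int) -> List[int]:
    """Split one 128-bit item into six 24-bit payload units (MSB first)."""
    if not 0 <= data < (1 << 128):
        raise ValueError("data must fit in 128 bits")
    return [(PAYLOAD << BODY_BITS) | ((data >> shift) & BODY_MASK)
            for shift in range(110, -1, -22)]

def unpack_item(words: List[int]) -> int:
    """Rebuild the 128-bit item from its six payload units."""
    data = 0
    for word in words:
        data = (data << BODY_BITS) | (word & BODY_MASK)
    return data & ((1 << 128) - 1)
```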
The packet trailer field, i.e., Tail in FIG. 4, indicates the end of the current packet transfer and includes the byte mask of the last data, i.e., where the byte address ends. The packet organization may also include an invalid field, i.e., Invalid in FIG. 4.
It should be noted that, the reserved fields exist in each field, and can be used for implementing check codes and ECC redundancy error correction.
The transmit data encoding and packaging unit 13 may package the first data according to the packet organization described above to obtain the first data packet, and send the first data packet in a video standard format. Specifically, the first data packet may be sent according to the timing produced by the timing generation unit 14. The timing generation unit 14 may generate a line synchronization signal HS, a field synchronization signal VS, and an active display data strobe signal DE, and the transmit data encoding and packaging unit 13 may send the first data packet within the active video data interval, that is, while the active display data strobe signal DE is high.
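The DE-gated sending described above can be modeled cycle by cycle as in the sketch below. It is a simplified software model only: HS and VS are not modeled, the line geometry parameters are made up for the example, and filling unused active cycles with an invalid unit (identifier 11) is an assumption about how the invalid field might be used.

```python
from typing import Iterable, Iterator, List, Optional

INVALID_WORD = 0b11 << 22  # assumption: filler for idle active cycles

def de_signal(active: int, blanking: int, lines: int) -> Iterator[bool]:
    """Yield DE per cycle: high for `active` cycles, low for `blanking`."""
    for _ in range(lines):
        for _ in range(active):
            yield True
        for _ in range(blanking):
            yield False

def transmit(words: Iterable[int], de: Iterable[bool]) -> List[Optional[int]]:
    """Place one 24-bit unit on the link in every cycle where DE is high.

    Cycles with DE low carry no packet data (None in this model).
    """
    pending = iter(words)
    out: List[Optional[int]] = []
    for high in de:
        out.append(next(pending, INVALID_WORD) if high else None)
    return out
```

For example, three units sent over two lines of two active cycles and one blanking cycle each occupy cycles 0, 1, and 3, with the last active cycle carrying filler.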
When the GPU acts as the receiving end, the AXI interface 11 also acts as a receiving end. The receive data decoding and unpacking unit 16 may receive the data packet, i.e., the second data packet, forwarded by the receiving switching module in the GPU, parse it, and separate out information such as the destination address, data length, and payload data. After parsing, the parsed data, i.e., the second data, is temporarily stored in the receive data buffer unit 15. The AXI interface 11 may read the second data from the receive data buffer unit 15 and, because the parsing result includes the destination address, send the second data to the location specified by the destination address.
It should be noted that the AXI interface 11 supports bidirectional read and write transfers, that is, it may receive and send data simultaneously, and data transfers support both single transfers and continuous block-move transfers. When transmitting data, the AXI interface 11, the transmit data buffer unit 12, the transmit data encoding and packaging unit 13, and the timing generation unit 14 cooperate. When receiving data, the receive data decoding and unpacking unit 16, the receive data buffer unit 15, and the AXI interface 11 cooperate.
The embodiment of the application provides a graphic processor which comprises a connection module, a sending switching module, a sender, a receiving switching module and a receiver; the connection module is used for generating a first data packet or analyzing the received second data packet; the sending switching module is used for forwarding the generated first data packet to the sender; the transmitter is configured to transmit the generated first data packet to a receiver of another graphics processor; the receiver is used for receiving a second data packet sent by the sender of another graphic processor; the receiving switching module is configured to forward the second data packet received by the receiver to the connection module. The connecting module is designed in the GPU, and the GPU can be directly interconnected through the receiver and the transmitter through the connecting module, so that an additional interface is not needed, and the difficulty of interconnection between the GPUs can be effectively reduced.
Referring to FIG. 5, an embodiment of the present application provides a data transmission method, which can be applied to the graphics processor described in the foregoing embodiment. Specifically, the method may include the following steps.
Step 110, obtaining first data, where the first data is data that needs to be sent to another graphics processor.
When the GPU is connected with another GPU, data transmission can be performed between the two GPUs. When the graphics processor is used as a data transmitting end, the graphics processor can acquire data which needs to be transmitted to another graphics processor, namely, first data. As in the previous embodiment, the connection module in the graphics processor may obtain the first data from an on-chip interconnect bus.
Step 120, packaging the first data to obtain a first data packet.
After acquiring the first data, the GPU may package it to obtain the first data packet. To package the first data, the GPU obtains the preset data packet format and packages the first data according to that format.
The preset data packet format comprises a packet header field, a packet payload data field, and a packet trailer field: the packet header field is organized as three 24-bit units, the packet payload data field as 6 consecutive 24-bit units, and the packet trailer field as one 24-bit unit. For the data structure of the data packet, refer to FIG. 4 and the corresponding parts of the foregoing embodiments; the details are not repeated here. The GPU packages the first data according to the preset data packet format to obtain the first data packet.
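From the unit counts given above, the size of a packet on the link can be estimated with a short calculation. The sketch assumes that burst directly counts the 128-bit data items in the packet and that no invalid units are interleaved; the function names are illustrative.

```python
HEADER_WORDS = 3      # packet header: three 24-bit units
WORDS_PER_ITEM = 6    # each 128-bit data item: six 24-bit units
TRAILER_WORDS = 1     # packet trailer: one 24-bit unit

def packet_words(burst: int) -> int:
    """Number of 24-bit units (link cycles) for a packet of `burst` items."""
    if burst < 1:
        raise ValueError("a packet carries at least one data item")
    return HEADER_WORDS + WORDS_PER_ITEM * burst + TRAILER_WORDS

def payload_efficiency(burst: int) -> float:
    """Fraction of transmitted bits that are payload data."""
    return (128 * burst) / (24 * packet_words(burst))
```

Under these assumptions a single-item packet takes 10 cycles, and the payload fraction rises with burst length toward the limit of 128/144 set by the six-unit encoding of each item.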
Step 130, transmitting the first data packet to a receiver of another connected graphics processor through a transmitter.
After the GPU obtains the first data packet, the first data packet can be sent to another connected GPU through a sender, so that data transmission between the GPUs is realized.
The GPU may send the data packet in a video standard format, and the first data packet may be sent within the active interval of the video timing. Timing signals may be generated inside the GPU, namely a line synchronization signal HS, a field synchronization signal VS, and an active display data strobe signal DE. While the active display data strobe signal DE is high, the GPU may send the first data packet in the video standard format to the receiver of the other connected GPU.
Steps 110 to 130 describe the data transmission process when the GPU acts as the data transmitting end. It can be appreciated that a GPU may act as either a data transmitting end or a data receiving end during data transmission. Referring to FIG. 6, an embodiment of the present application provides another data transmission method, which can be applied to the graphics processor described in the foregoing embodiment. Specifically, the method may include the following steps.
Step 210, receiving a second data packet sent by a sender of another graphics processor connected thereto.
When GPUs are connected to each other for data transmission and another GPU needs to send data to the current GPU, the current GPU is the data receiving end. As described in the foregoing embodiments, a GPU sends its first data packet to another GPU in a video standard format, so the current GPU may receive the first data packet sent by the other GPU. When a GPU acts as the receiving end, the data packet it receives is defined as the second data packet.
Step 220, parsing the second data packet to obtain the second data and the destination address.
After the GPU receives the second data packet, it may parse the packet to obtain the destination address and the second data. Because the second data packet has a fixed data format, the GPU can separate the packet header field, the packet payload data field, and the packet trailer field. The highest two bits of each 24-bit unit serve as the field identifier: 00 for the packet header field, 01 for the packet payload data field, 10 for the packet trailer field, and 11 for the invalid field. Thus, the fields contained in the second data packet can be distinguished by the highest two bits of each unit.
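The identifier-based separation described above can be sketched as a small stream splitter. The identifier values are from the text; the function name, the dictionary representation of a packet, the choice to skip invalid units silently, and the rule that a trailer unit closes the current packet are assumptions of this example.

```python
from typing import Dict, List

HEAD, PAYLOAD, TAIL, INVALID = 0b00, 0b01, 0b10, 0b11
BODY_BITS = 22
BODY_MASK = (1 << BODY_BITS) - 1

def split_packets(words: List[int]) -> List[Dict[str, List[int]]]:
    """Group a stream of 24-bit units into packets by their top two bits."""
    names = {HEAD: "head", PAYLOAD: "payload", TAIL: "tail"}
    packets: List[Dict[str, List[int]]] = []
    current = None
    for word in words:
        tag = (word >> BODY_BITS) & 0b11
        if tag == INVALID:
            continue                      # filler carries no packet content
        if current is None:
            if tag != HEAD:
                raise ValueError("packet must start with a header unit")
            current = {"head": [], "payload": [], "tail": []}
        current[names[tag]].append(word & BODY_MASK)
        if tag == TAIL:                   # trailer ends the current packet
            packets.append(current)
            current = None
    return packets
```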
Step 230, transmitting the second data to the location indicated by the destination address.
The packet header field includes a destination address, and the packet payload data field includes data that actually needs to be transmitted, i.e., the second data. The GPU may parse out the destination address in the header field and the second data in the packet payload data field. Thus, the GPU may transmit the second data to the location indicated by the destination address.
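Delivering the second data to the destination address can be pictured as a masked memory write. In the sketch below, the memory is a plain `bytearray`, bit i of the byte mask enabling byte i is an assumed convention (similar to a write strobe), and the little-endian byte order and the name `write_item` are likewise assumptions.

```python
def write_item(memory: bytearray, addr: int, data: int,
               byte_enable: int = 0xFFFF) -> None:
    """Write one 128-bit item at `addr`, honoring a 16-bit byte mask.

    Only bytes whose mask bit is set are written; others keep their
    previous contents.
    """
    if addr < 0 or addr + 16 > len(memory):
        raise ValueError("destination outside memory")
    raw = data.to_bytes(16, "little")
    for i in range(16):
        if (byte_enable >> i) & 1:
            memory[addr + i] = raw[i]
```

A mask such as 0xFFFC would skip the first two bytes of the item, which matches the idea of the first-data byte mask marking where valid bytes begin.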
It should be noted that the GPU may perform data transmission and data reception simultaneously.
The process of data transmission between GPUs is described below with a specific example. Suppose there are two GPUs, a first GPU and a second GPU, between which data needs to be transmitted, and the first GPU needs to send first data, namely data C, to the second GPU for storage.
In this case, the first GPU may obtain the data C, package it according to the preset data packet format to obtain a data packet D, and send the data packet D to the second GPU in a video standard format. The second GPU may receive the data packet D sent by the first GPU in the video standard format and, on receiving it, parse the data packet D to obtain the data C and the destination address of the data C. The second GPU may then transfer the data C to the destination address for storage.
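The example of data C and data packet D can be sketched as a round trip in Python. The packet shape (three header words, six payload words, one tail word, each 24 bits with a 2-bit identifier) follows the preset data packet format; everything else is an assumption made for illustration: the destination address and the data are treated as integers split into 22-bit chunks, most significant first, the tail word body is left at zero, and `pack`, `unpack` and the `memory` dictionary are hypothetical names.

```python
TAG_HEADER, TAG_PAYLOAD, TAG_TRAILER = 0b00, 0b01, 0b10
BODY_BITS = 22                    # 24-bit word minus the 2-bit identifier
BODY_MASK = (1 << BODY_BITS) - 1


def to_chunks(value, count):
    """Split a non-negative integer into `count` 22-bit chunks, most significant first."""
    if value < 0 or value >> (BODY_BITS * count):
        raise ValueError("value does not fit in the available words")
    return [(value >> (BODY_BITS * i)) & BODY_MASK for i in reversed(range(count))]


def from_chunks(chunks):
    """Inverse of to_chunks."""
    value = 0
    for chunk in chunks:
        value = (value << BODY_BITS) | chunk
    return value


def pack(data, dest_addr):
    """First GPU: build the 10-word packet (hypothetical field layout)."""
    words = [(TAG_HEADER << BODY_BITS) | c for c in to_chunks(dest_addr, 3)]
    words += [(TAG_PAYLOAD << BODY_BITS) | c for c in to_chunks(data, 6)]
    words.append(TAG_TRAILER << BODY_BITS)   # tail contents are not modelled here
    return words


def unpack(words):
    """Second GPU: recover (data, dest_addr) from the tagged words."""
    header = [w & BODY_MASK for w in words if w >> BODY_BITS == TAG_HEADER]
    payload = [w & BODY_MASK for w in words if w >> BODY_BITS == TAG_PAYLOAD]
    return from_chunks(payload), from_chunks(header)


memory = {}                        # stands in for the second GPU's storage
data_c = 0xC0FFEE
packet_d = pack(data_c, dest_addr=0x80001000)
data, addr = unpack(packet_d)
memory[addr] = data                # store data C at the destination address
```

The video-standard-format transport between the transmitter and the receiver is not modelled; the sketch only covers packaging on one side and parsing on the other.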
The data transmission method provided by the embodiment of the application is applied to a graphics processor provided with a connection module. The graphics processor acquires first data, the first data being data that needs to be sent to another graphics processor; packages the first data to obtain a first data packet; and sends the first data packet through a transmitter to a receiver of the other, connected graphics processor. By designing the connection module in the graphics processor, GPUs can be interconnected and exchange data directly through the receiver and the transmitter, without any additional interface, which effectively reduces the difficulty of interconnecting GPUs.
Referring to fig. 7, an embodiment of the present application provides a data transmission device 400, which is applicable to a graphics processor, wherein the data transmission device 400 includes a receiving module 410, a processing module 420 and a transmission module 430. The receiving module 410 is configured to obtain first data, where the first data is data that needs to be sent to another graphics processor; the processing module 420 is configured to package the first data to obtain a first data packet; the transmission module 430 is configured to send the first data packet through a transmitter to a receiver of the other, connected graphics processor.
Further, the receiving module 410 is further configured to receive a second data packet sent by a sender of the connected other graphics processor; the processing module 420 is further configured to parse the second data packet to obtain second data and a destination address; the transmission module 430 is further configured to transmit the second data to a location indicated by the destination address.
Further, the processing module 420 is further configured to obtain a preset data packet format and to package the first data according to the preset data packet format.
Further, the preset data packet format includes a packet header field, a packet payload data field and a packet tail field, wherein the packet header field is organized as 3 24-bit data words, the packet payload data field is organized as 6 consecutive 24-bit data words, and the packet tail field is organized as 1 24-bit data word.
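The sizes implied by this format can be worked out with a few lines of Python. The word counts and the 24-bit width come from the text; the net payload figure additionally assumes that the 2-bit identifier described in the method embodiment sits inside every 24-bit word, which is an interpretation rather than something this passage states.

```python
WORD_BITS = 24
HEADER_WORDS, PAYLOAD_WORDS, TRAILER_WORDS = 3, 6, 1
TAG_BITS = 2                       # assumption: identifier counted inside each word

total_words = HEADER_WORDS + PAYLOAD_WORDS + TRAILER_WORDS     # 10
total_bits = total_words * WORD_BITS                           # 240
payload_bits_raw = PAYLOAD_WORDS * WORD_BITS                   # 144
payload_bits_net = PAYLOAD_WORDS * (WORD_BITS - TAG_BITS)      # 132

# Fraction of the packet that carries user data under the assumption above.
payload_fraction = payload_bits_net / total_bits
```

Under these assumptions a packet occupies 240 bits, of which 132 are available for the data actually being transferred.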
The data transmission device provided by the embodiment of the application is applied to a graphics processor provided with a connection module. The graphics processor acquires first data, the first data being data that needs to be sent to another graphics processor; packages the first data to obtain a first data packet; and sends the first data packet through a transmitter to a receiver of the other, connected graphics processor. By designing the connection module in the graphics processor, GPUs can be interconnected and exchange data directly through the receiver and the transmitter, without any additional interface, which effectively reduces the difficulty of interconnecting GPUs.
It should be noted that, for convenience and brevity of description, specific working processes of the apparatus described above may refer to corresponding processes in the foregoing method embodiments, which are not repeated herein.
Referring to fig. 8, an embodiment of the present application provides a block diagram of an electronic device 500. The electronic device 500 includes at least two graphics processors 510, which are used to perform the above-mentioned data transmission method.
The electronic device 500 may be a terminal device capable of running an application program, such as a tablet computer or a notebook computer. The graphics processor 510 may be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), or programmable logic array (Programmable Logic Array, PLA). A central processing unit (Central Processing Unit, CPU) may also be integrated into the electronic device 500. The CPU mainly handles the operating system, the user interface, application programs and the like; the graphics processor 510 is responsible for rendering and drawing display content.
The electronic device provided by the embodiment of the application includes at least two graphics processors. Each graphics processor acquires first data, the first data being data that needs to be sent to another graphics processor; packages the first data to obtain a first data packet; and sends the first data packet through a transmitter to a receiver of the other, connected graphics processor. Because the graphics processor is provided with the connection module, GPUs can be interconnected and exchange data directly through the receiver and the transmitter, without any additional interface, which effectively reduces the difficulty of interconnecting GPUs.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (11)

1. A graphics processor, characterized in that the graphics processor comprises a connection module, a transmission switching module, a transmitter, a reception switching module and a receiver;
the connection module is used for acquiring first data to be sent from the on-chip interconnection bus and generating a first data packet, or is used for analyzing a received second data packet to obtain second data and then sending the second data to the on-chip interconnection bus;
the sending switching module is used for forwarding the generated first data packet to the sender;
the transmitter is used for transmitting the first data packet to a receiver of another graphic processor;
the receiver is used for receiving a second data packet sent by the sender of another graphic processor;
the receiving switching module is used for forwarding the second data packet to the connecting module;
the on-chip interconnection bus comprises an advanced extensible interface AXI interconnection bus, and the connection module comprises an AXI interface, a sending data buffer unit, a sending data coding and packaging unit, a receiving data buffer unit and a receiving data decoding and unpacking unit;
the AXI interface is used for acquiring first data to be sent from the AXI interconnection bus, storing the first data to be sent into the sending data caching unit, and sending second data in the receiving data caching unit to the AXI interconnection bus;
the sending data coding and packaging unit is used for packaging the first data in the sending data caching unit to obtain the first data packet;
the received data decoding and unpacking unit is used for analyzing the second data packet sent by the receiving switching module to obtain the second data, and storing the second data into the received data caching unit.
2. The graphics processor of claim 1, wherein the AXI interface is further configured to obtain a data start address of the first data from an AXI interconnect bus;
and reading the first data from the AXI interconnect bus according to the data start address.
3. The graphics processor of claim 1, wherein the AXI interface is further configured to read the second data from the received data cache unit;
and sending the second data to the position indicated by the destination address according to the destination address in the second data.
4. The graphics processor of claim 1, wherein the connection module further comprises: a timing generation unit for generating a video timing;
the sending data coding and packaging unit is further configured to send the first data packet to the sending switching module in a video standard format in the video time sequence valid interval.
5. A data transmission method applied to the graphics processor of any one of claims 1-4, the method comprising:
acquiring first data, wherein the first data is data which needs to be sent to another graphic processor;
packaging the first data to obtain a first data packet;
and transmitting the first data packet to a receiver of another connected graphics processor through a transmitter.
6. The method of claim 5, wherein the method further comprises:
receiving a second data packet sent by a sender of another connected graphics processor;
analyzing the second data packet to obtain second data and a destination address;
and transmitting the second data to the position indicated by the destination address.
7. The method of claim 5, wherein said packetizing said first data to obtain a first data packet comprises:
acquiring a preset data packet format;
and packaging the first data according to the preset data packet format.
8. The method of claim 7, wherein the predetermined data packet format comprises a header field, a packet payload data field, and a trailer field, wherein the header field is organized using 3 24-bit data, the packet payload data field is organized using 6 consecutive 24-bit data, and the trailer field comprises 1 24-bit data.
9. A data transmission apparatus for use with the graphics processor of any one of claims 1-4, said apparatus comprising:
the receiving module is used for acquiring first data, wherein the first data is data which needs to be sent to another graphic processor;
the processing module is used for packaging the first data to obtain a first data packet;
and the transmission module is used for transmitting the first data packet to a receiver of another connected graphics processor through a transmitter.
10. An electronic device comprising at least two graphics processors as claimed in any one of claims 1-4.
11. A computer readable storage medium having stored therein program code which is callable by a graphics processor to perform the method according to any one of claims 5 to 8.
CN202110229587.3A 2021-03-02 2021-03-02 Graphics processor, data transmission method, data transmission device, electronic equipment and storage medium Active CN113051212B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110229587.3A CN113051212B (en) 2021-03-02 2021-03-02 Graphics processor, data transmission method, data transmission device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110229587.3A CN113051212B (en) 2021-03-02 2021-03-02 Graphics processor, data transmission method, data transmission device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113051212A CN113051212A (en) 2021-06-29
CN113051212B true CN113051212B (en) 2023-12-05

Family

ID=76509780

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110229587.3A Active CN113051212B (en) 2021-03-02 2021-03-02 Graphics processor, data transmission method, data transmission device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113051212B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115861030B (en) * 2023-01-31 2023-07-25 南京砺算科技有限公司 Graphics processor, system variable generation method thereof and medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102007479A (en) * 2008-03-31 2011-04-06 先进微装置公司 Peer-to-peer special purpose processor architecture and method
CN102118360A (en) * 2009-12-31 2011-07-06 联想(北京)有限公司 Data processing method, thin client system, server and thin clients
CN104901859A (en) * 2015-06-11 2015-09-09 东南大学 AXI/PCIE bus converting device
CN107124286A (en) * 2016-02-24 2017-09-01 深圳市知穹科技有限公司 A kind of mass data high speed processing, the system and method for interaction
CN107329927A (en) * 2016-04-28 2017-11-07 富泰华工业(深圳)有限公司 A kind of data-sharing systems and method
CN109739556A (en) * 2018-12-13 2019-05-10 北京空间飞行器总体设计部 A kind of general deep learning processor that interaction is cached based on multiple parallel and is calculated
CN109995943A (en) * 2019-03-28 2019-07-09 维沃移动通信有限公司 A kind of information processing method and terminal device
CN110113869A (en) * 2018-02-01 2019-08-09 纬创资通股份有限公司 Modular unit and its control method
CN110851376A (en) * 2019-10-21 2020-02-28 天津大学 PCIe interface design method based on FPGA
CN112328532A (en) * 2020-11-02 2021-02-05 长沙景嘉微电子股份有限公司 Multi-GPU communication method and device, storage medium and electronic device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9304730B2 (en) * 2012-08-23 2016-04-05 Microsoft Technology Licensing, Llc Direct communication between GPU and FPGA components

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a multi-GPU parallel general-purpose computing platform based on a virtualized environment; 徐恒; 吴俊敏; 杨志刚; 尹燕; 计算机应用与软件 (11) *

Also Published As

Publication number Publication date
CN113051212A (en) 2021-06-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant