WO2016169032A1 - Data format conversion device, buffer chip, and method - Google Patents

Data format conversion device, buffer chip, and method

Info

Publication number
WO2016169032A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
converted
conversion
format conversion
data format
Prior art date
Application number
PCT/CN2015/077311
Other languages
English (en)
French (fr)
Inventor
柴守刚
梁文亮
庄良
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN201580076244.0A priority Critical patent/CN107209663B/zh
Priority to PCT/CN2015/077311 priority patent/WO2016169032A1/zh
Publication of WO2016169032A1 publication Critical patent/WO2016169032A1/zh
Priority to US15/789,011 priority patent/US10402119B2/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0613 Improving I/O performance in relation to throughput
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G06F9/30014 Arithmetic instructions with variable precision
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30025 Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3877 Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor
    • G06F9/3879 Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor for non-native instruction execution, e.g. executing a command; for Java instruction set
    • G06F9/3881 Arrangements for communication of instructions and data

Definitions

  • The embodiments of the present invention relate to computer technologies, and in particular to a data format conversion device, a buffer chip, and a method.
  • GPGPU General-Purpose Graphics Processing Unit
  • FPGA Field Programmable Gate Array
  • The CPU and the acceleration calculation unit each have their own storage unit. The memory of the CPU is generally called the main memory, and the memory of the acceleration calculation unit is called the device memory.
  • Data is transmitted between the main memory and the device memory through a bus.
  • Iterative computation is a typical computation-intensive task. To improve computational efficiency, iterative computation is usually assigned to an acceleration calculation unit. Iterative computation is typically applied to solving equations, computing matrix eigenvalues, or Singular Value Decomposition (SVD). As shown in Figure 2, the basic idea of iterative computation is successive approximation: a rough initial value is taken first, and then the intermediate result is repeatedly substituted into the same iterative formula until the result converges to the required accuracy.
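  • The successive-approximation loop described above can be sketched as follows. This is only an illustration: the function names, the tolerance, and the choice of Newton's square-root update as the iterative formula are assumptions made here, not part of the patent text.

```python
def iterate_to_convergence(step, initial, tolerance=1e-12, max_iterations=1000):
    """Substitute the intermediate result back into `step` until it stops changing."""
    current = initial
    for _ in range(max_iterations):
        updated = step(current)
        # Converged when two successive results agree to within the tolerance.
        if abs(updated - current) < tolerance:
            return updated
        current = updated
    raise RuntimeError("iteration did not converge")


def sqrt2_step(x):
    """Hypothetical iterative formula: one Newton update for the root of x*x = 2."""
    return 0.5 * (x + 2.0 / x)


# Start from a rough initial value and refine it.
root = iterate_to_convergence(sqrt2_step, initial=1.0)
```

    Each pass through the loop is one "substitute and recompute" step, which is why this kind of workload is usually handed to an acceleration calculation unit.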
  • SVD Singular Value Decomposition
  • Data in a high-precision format is usually used for calculation, and data in the high-precision format is also used during data transmission.
  • When the high-precision data format is adopted for both the accelerated calculation and the data transmission, the calculation accuracy is satisfied, but the amount of transmitted data increases. The larger data volume increases the data transmission delay, and the overall calculation time of the CPU increases as well.
  • The CPU transmits low-precision data to the acceleration calculation unit through the bus.
  • After the acceleration calculation unit receives the data, the low-precision data is zero-padded into a high-precision data format, and then the calculation is performed. When the acceleration calculation unit needs to transmit data to the CPU, the high-precision data is first converted to low-precision data and then sent to the CPU via the bus.
  • Embodiments of the present invention provide a data format conversion apparatus, a buffer chip, and a method that reduce the amount of data transmitted between the main memory of the CPU and the device memory of the acceleration calculation unit without occupying additional computing resources of the acceleration calculation unit or the CPU, thereby ensuring calculation accuracy while improving computational efficiency.
  • an embodiment of the present invention provides a data format conversion apparatus.
  • The device is used in a buffer chip in a memory, and the device comprises a control module and a conversion module;
  • the control module is configured to send a control instruction to the conversion module according to a received data copy command, where the data copy command includes information about the data to be converted, a format conversion type, and an address of the data to be converted; the control instruction is used to instruct the conversion module to perform data format conversion and storage address mapping on the data to be converted;
  • the conversion module is configured to complete data format conversion and storage address mapping of the data to be converted according to the received control instruction, and send the converted data to an acceleration calculation unit.
  • Completing the data format conversion and storage address mapping of the data to be converted includes: when writing, the conversion module stores the data to be converted in the format indicated by the control instruction and at the storage address indicated by the control instruction.
  • The control instruction includes a control word stored in the mode selection register;
  • the control word includes a conversion mode switch control word and a format conversion type control word;
  • the conversion mode switch control word is used to instruct the conversion module to enable or disable data format conversion of the data to be converted;
  • the format conversion type control word is used to indicate the data format type of the data to be converted before conversion and its data format type after conversion.
  • Alternatively, the control instruction includes a control code stored in the mode selection register;
  • the control code is used to instruct the conversion module to enable data format conversion of the data to be converted, or to disable data format conversion of the data to be converted, or to indicate the data format type of the data to be converted before conversion and its data format type after conversion.
  • the converting module includes a data format converting unit and an address converting unit;
  • the data format conversion unit is configured to complete data format conversion of the data to be converted according to the control instruction sent by the control module.
  • the address conversion unit is configured to obtain the storage address of the data to be converted after conversion according to the control instruction, the storage address of the data before conversion, the data format type before conversion, and the data format type after conversion.
  • an embodiment of the present invention provides a buffer chip, including: a bus interface, an address buffer unit, a control cache unit, a data buffer unit, and the apparatus according to any of the first aspects.
  • an embodiment of the present invention provides a data format conversion method, including:
  • the bus interface of the buffer chip in the memory receives the data copy command sent by the central processing unit (CPU) and, according to the data copy command, caches the copied data to be converted in the data buffer unit, caches the storage address of the data to be converted in the address buffer unit, and caches the format conversion type of the data to be converted in the control cache unit; it then sends the data copy command to the control module of the data format conversion device;
  • the control module sends a control instruction to the conversion module of the data format conversion device according to the received data copy command, where the data copy command includes information about the data to be converted, the format conversion type, and the address of the data to be converted; the control instruction is used to instruct the conversion module to perform data format conversion and storage address mapping on the data to be converted;
  • the conversion module completes the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and sends the converted data to the acceleration calculation unit.
  • The control instruction includes a control word;
  • the control word includes a conversion mode switch control word and a format conversion type control word;
  • the conversion mode switch control word is used to instruct the conversion module to enable or disable data format conversion of the data to be converted;
  • the format conversion type control word is used to indicate the data format type of the data to be converted before conversion and its data format type after conversion.
  • The control instruction includes a control code;
  • the control code is used to instruct the conversion module to enable data format conversion of the data to be converted, or to disable data format conversion of the data to be converted, or to indicate the data format type of the data to be converted before conversion and its data format type after conversion.
  • The conversion module completing the data format conversion and storage address mapping of the data to be converted according to the received control instruction includes:
  • the conversion module completes the data format conversion of the data to be converted according to the control instruction sent by the control module, based on the data format type before conversion and the data format type after conversion;
  • the conversion module obtains the storage address of the data to be converted after conversion according to the control instruction, the storage address before conversion, the data format type before conversion, and the data format type after conversion.
  • the data copy command is defined as:
  • MemCopy is the function name, indicating a copy of the data to be converted between the main memory of the CPU and the device memory of the acceleration calculation unit; destination indicates the destination address of the data to be converted; source indicates the original address of the data to be converted; size indicates the size of the data to be converted; direction indicates the copy direction of the data to be converted; cpytype indicates the format conversion type.
  • Performing the data format conversion and storage address mapping of the data to be converted includes: when writing, the conversion module stores the data to be converted in the format indicated by the control instruction and at the storage address indicated by the control instruction.
  • Embodiments of the present invention provide a data format conversion apparatus, a buffer chip, and a method.
  • The device is used in a buffer chip in the memory. The control module of the data format conversion device is configured to send a control instruction to the conversion module when it receives the data copy command; the conversion module is configured to complete the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and to send the converted data to the acceleration calculation unit.
  • FIG. 1 is a schematic diagram of a computer structure with an accelerated computing unit in the prior art
  • Figure 2 is a schematic block diagram of an iterative calculation
  • FIG. 3 is a schematic diagram of an acceleration calculation process in the prior art
  • FIG. 4 is a schematic diagram of a computer structure including a data format conversion apparatus according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an acceleration calculation process according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a buffer chip according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a mode selection register in a data format conversion apparatus on a buffer chip according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram 1 of storage address mapping in a data format conversion process according to an embodiment of the present invention.
  • FIG. 9 is a second schematic diagram of storage address mapping in a data format conversion process according to an embodiment of the present invention.
  • FIG. 10a is a schematic structural diagram of another buffer chip according to an embodiment of the present invention.
  • FIG. 10b is a flowchart of a data format conversion method according to an embodiment of the present invention.
  • The embodiment of the present invention provides a solution to the conflict between data transmission efficiency and calculation precision. The data format is converted directly during data transmission, which occupies no additional computing resources, and in particular none of the accelerator's computing resources, so the overall computational efficiency of the accelerated computing system improves while computational accuracy is preserved.
  • the solution provided by the embodiment of the present invention can replace the data format conversion scheme mentioned in the foregoing background, and solve the contradiction between the data transmission efficiency and the calculation precision, and can be applied to any scenario that needs to perform data format conversion, for example, with an accelerator.
  • the memory of the accelerated computing unit is a Synchronous Dynamic Random Access Memory (SDRAM), and the SDRAM memory module is composed of a plurality of memory chips arranged on a Printed Circuit Board (PCB).
  • SDRAM Synchronous Dynamic Random Access Memory
  • PCB Printed Circuit Board
  • the buffer chip buffers and shapes the signals of the data bus, the address bus and the control bus.
  • The embodiment of the invention improves the buffer chip so that the data format is converted directly during data transmission.
  • The present invention implements the data format conversion process in the device memory, enabling the memory to perform data format conversion while reading and writing data.
  • The data format conversion device is disposed on the buffer chip of the device memory of the acceleration calculation unit, and the data format conversion function is implemented on the device memory during the reading and writing of data. The acceleration calculation process in this case is as follows.
  • The data format can be converted from a low-precision format type to a high-precision format type while data is received, and from a high-precision format type to a low-precision format type while data is sent. Neither receiving nor sending data occupies the computing resources of the acceleration calculation unit itself, and the data transmitted on the bus between the device memory and the main memory is always low-precision data with a small data volume.
  • FIG. 6 is a schematic structural diagram of a buffer chip according to an embodiment of the present invention.
  • the buffer chip 20 provided in this embodiment may specifically include: a bus interface 204, an address buffer unit 203, a control buffer unit 202, a data buffer unit 201, and a data format conversion device 10;
  • the bus interface 204 is configured to receive a data copy command sent by the CPU and, according to the data copy command, cache the copied data to be converted in the data buffer unit 201, cache the storage address of the data to be converted in the address buffer unit 203, and cache the format conversion type of the data to be converted in the control buffer unit 202; and send the data copy command to the data format conversion device 10;
  • the data format conversion device 10 includes a control module 101 and a conversion module 102;
  • the control module 101 is configured to: upon receiving the data copy command, obtain information about the data to be converted, the format conversion type, and the address of the data to be converted, and send a control instruction to the conversion module 102; the control instruction is used to instruct the conversion module 102 to perform data format conversion and storage address mapping on the data to be converted.
  • the conversion module 102 is configured to complete data format conversion and storage address mapping of the data to be converted according to the received control instruction, and send the converted data to the acceleration calculation unit.
  • When writing, the conversion module 102 directly stores the data to be converted in the format indicated by the control instruction and at the storage address indicated by the control instruction, so that the data is held in the memory in its converted format and can be read out directly in that format. The data format conversion is thereby completed directly in the process of reading and writing data.
  • control module 101 may be a mode selection register, and the mode selection register is disposed in the buffer chip 20, and all control words or control codes required for control are pre-stored in the mode selection register.
  • the buffer chip 20 is a buffer chip in the memory.
  • the above data format conversion can be realized in the process of reading and writing to the memory, without occupying additional computing resources of the accelerator, and improving the calculation efficiency.
  • The control instruction includes a control word stored in the mode selection register; the control word includes a conversion mode switch control word and a format conversion type control word. The conversion mode switch control word is used to instruct the conversion module 102 to enable or disable data format conversion of the data to be converted; the format conversion type control word is used to indicate the data format type of the data to be converted before conversion and its data format type after conversion.
  • For example, the format conversion type control words can be set to S2I, S2F, I2L, F2D, and so on, where S2I represents short type to int type, S2F represents short type to float type, I2L represents int type to long int type, and F2D represents float type to double type.
  • When the format conversion type control word S2I is set to 1, the data format of the data to be converted is converted from the short type to the int type; at this time, the format conversion type control words S2F, I2L, and F2D are all set to 0.
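  • The control-word scheme above, one on/off switch plus one flag per conversion type with exactly one flag set to 1, can be modelled roughly as follows. The function name and the dictionary representation are assumptions for illustration; the patent does not specify a register layout here.

```python
# Flag names taken from the text; the data structure itself is hypothetical.
FORMAT_FLAGS = ("S2I", "S2F", "I2L", "F2D")


def select_conversion(switch_on, flags):
    """Return the active conversion type, or None when conversion is switched off."""
    if not switch_on:
        return None
    active = [name for name in FORMAT_FLAGS if flags.get(name, 0) == 1]
    # The text sets one flag to 1 and all others to 0, so anything else is rejected.
    if len(active) != 1:
        raise ValueError("exactly one format conversion type flag must be set")
    return active[0]


# S2I set to 1, the other flags set to 0: short type is converted to int type.
chosen = select_conversion(True, {"S2I": 1, "S2F": 0, "I2L": 0, "F2D": 0})
```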
  • The control instruction includes control codes pre-stored in the mode selection register; the control code is used to instruct the conversion module 102 to enable data format conversion of the data to be converted, or to disable data format conversion of the data to be converted, or to indicate the data format type of the data to be converted before conversion and its data format type after conversion.
  • For example, the control code 0000 can indicate that data format conversion of the data to be converted is turned off, the control code 1111 that it is turned on, the control code 0001 short type to int type, the control code 0010 short type to float type, and the control code 0011 int type to long int type. This embodiment does not limit this.
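  • The example control codes above amount to a small lookup table. A minimal sketch, assuming a 4-bit code and a hypothetical `decode_control_code` helper (only the five code values and their meanings come from the text):

```python
# Code values and meanings as listed in the text; tuple layout is an assumption.
CONTROL_CODES = {
    0b0000: ("off", None),
    0b1111: ("on", None),
    0b0001: ("convert", ("short", "int")),
    0b0010: ("convert", ("short", "float")),
    0b0011: ("convert", ("int", "long int")),
}


def decode_control_code(code):
    """Map a 4-bit control code to (action, (source type, target type))."""
    try:
        return CONTROL_CODES[code]
    except KeyError:
        # Codes not listed in the text are treated as unassigned in this sketch.
        raise ValueError(f"unassigned control code: {code:04b}") from None


action, types = decode_control_code(0b0010)
```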
  • Control instructions can be passed from the application to the mode selection register in two ways.
  • In the first way, a dedicated configuration line is added to the memory module and to the control lines of the bus, and the user writes the control instruction through the configuration line. In this way, the hardware interface of the memory module needs to be modified.
  • In the second way, an existing control line of the memory is multiplexed: the protocol of the control line is modified so that a transmission process for the control instruction is added to the existing control line protocol.
  • The embodiment provides a unified data copy command, which is in effect an interface function that encapsulates the functions of the data format conversion device 10. In actual use, the user only needs to call the interface function to achieve data conversion.
  • The interface function, that is, the data copy command, is defined as:
  • MemCopy is the function name, indicating a copy of the data to be converted between main memory and device memory; destination indicates the destination address of the data to be converted; source indicates the original address of the data to be converted; size indicates the size of the data to be converted; direction indicates the copy direction of the data to be converted: from main memory to device memory, from device memory to main memory, or from one device memory to another device memory; cpytype indicates the format conversion type, such as S2I, S2F, or I2L, corresponding to the control instruction.
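  • To make the five parameters concrete, the sketch below bundles them into a command record. It is not the patent's definition of MemCopy: the `build_copy_command` helper, the `Direction` enum, and the `Device2Device` spelling are assumptions; only `Host2Device`, `Device2Host`, and the parameter names appear in the text.

```python
from enum import Enum


class Direction(Enum):
    HOST_TO_DEVICE = "Host2Device"      # main memory -> device memory
    DEVICE_TO_HOST = "Device2Host"      # device memory -> main memory
    DEVICE_TO_DEVICE = "Device2Device"  # hypothetical name for device -> device


def build_copy_command(destination, source, size, direction, cpytype):
    """Collect the MemCopy arguments into a plain record (illustrative only)."""
    if size < 0:
        raise ValueError("size must be non-negative")
    return {
        "destination": destination,
        "source": source,
        "size": size,
        "direction": Direction(direction).value,
        "cpytype": cpytype,
    }


command = build_copy_command(0, 0, 2, Direction.HOST_TO_DEVICE, "short2double")
```

    In the scheme described here, such a record would carry what the control module needs to derive a control instruction: where the data is, where it goes, and which format conversion to apply.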
  • According to the received control instruction, the conversion module 102 may convert the data to be converted from a low-precision format type to a high-precision format type, or from a high-precision format type to a low-precision format type. Because the data format type changes during conversion, the required storage space changes as well; accordingly, the conversion module 102 also completes the storage address mapping of the data to be converted according to the control instruction.
  • the conversion module 102 can include a data format conversion unit and an address translation unit;
  • the data format conversion unit is configured to complete data format conversion of the data to be converted according to the control instruction sent by the control module 101;
  • the address conversion unit is configured to obtain the storage address of the data to be converted after conversion according to the control instruction, the storage address of the data before conversion, the data format type before conversion, and the data format type after conversion.
  • In the data format conversion unit, if the data to be converted is converted from a low-precision format type to a high-precision format type, the unit correspondingly increases the bit width of the data format: the high-order portion is zero-padded and the low-order portion keeps the original data. If the data to be converted is converted from a high-precision format type to a low-precision format type, the unit correspondingly reduces the bit width of the data format, and the low-order portion can be directly cut off.
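  • The widening case can be illustrated with a bit-level sketch. It covers only zero-padding of a non-negative bit pattern to a larger width; the function name is hypothetical, and sign handling and floating-point encodings are outside what this sketch models.

```python
def widen_zero_pad(value, src_bits, dst_bits):
    """Return `value` as a dst_bits-wide bit string with zero-padded high-order bits."""
    if dst_bits < src_bits:
        raise ValueError("target width must not be smaller than source width")
    if not 0 <= value < (1 << src_bits):
        raise ValueError("value does not fit in the source width")
    # The low-order bits keep the original data; the added high-order bits are zero.
    return format(value, f"0{dst_bits}b")


# A 16-bit pattern widened to 32 bits.
widened = widen_zero_pad(0x1234, 16, 32)
```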
  • When the data format type changes, the required storage space changes accordingly: if the data to be converted is converted from a low-precision format type to a high-precision format type, the required storage space increases; if it is converted from a high-precision format type to a low-precision format type, the required storage space decreases. The storage address corresponding to each data item therefore changes.
  • The address conversion unit can calculate the storage address of each data item after conversion from its storage address before conversion and the data format types before and after conversion.
  • the acceleration calculation unit is a GPGPU.
  • the CPU allocates the SVD decomposition process to the GPGPU.
  • The SVD calculation process requires two data exchanges in total: first, when the calculation starts, the initial value is transferred from main memory to device memory; second, after the calculation finishes, the calculation result is transferred from device memory to main memory.
  • The main memory sends and receives short type data, while GPGPU calculations use double type data. That is, the short type initial value needs to be converted to double type data at the beginning of the calculation, and the double type calculation result needs to be converted to the short type at the end of the calculation.
  • The two data transmission processes only need to call the interface function provided by the embodiment twice.
  • When the calculation starts, the interface function is called as MemCopy(destination, source, size, Host2Device, short2double). Assuming that the storage addresses of the short type data stored in the main memory are 0 and 1, after the short type data is converted into double type data, the storage addresses of the double type data stored in the device memory are mapped to 0 and 4, respectively, as shown in FIG. 8;
  • After the calculation finishes, the interface function is called as MemCopy(destination, source, size, Device2Host, double2short). The storage addresses of the double type data stored in the device memory are 0 and 4; after the double type data is converted into short type data, the storage addresses of the short type data stored in the main memory are mapped to 0 and 1, respectively, as shown in FIG. 9.
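  • The address mapping in this example (0 and 1 become 0 and 4, and back again) is consistent with scaling each address by the ratio of element sizes. The sketch below assumes a 2-byte short and an 8-byte double, and that addresses count in units of one short; those sizes and the `map_address` helper are assumptions, not values stated in the text.

```python
# Assumed element sizes in bytes (typical, but not given in the text).
TYPE_SIZE_BYTES = {"short": 2, "int": 4, "float": 4, "double": 8}


def map_address(address, src_type, dst_type):
    """Scale a storage address by the ratio of target to source element size."""
    scaled = address * TYPE_SIZE_BYTES[dst_type]
    if scaled % TYPE_SIZE_BYTES[src_type]:
        raise ValueError("address is not aligned to a whole target element")
    return scaled // TYPE_SIZE_BYTES[src_type]


# Host to device: short data at 0 and 1 is stored as double data.
to_device = [map_address(a, "short", "double") for a in (0, 1)]
# Device to host: double data at 0 and 4 is stored as short data.
to_host = [map_address(a, "double", "short") for a in (0, 4)]
```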
  • the acceleration calculation unit of the data format conversion device 10 can implement the data format by the data format conversion device 10 in the buffer chip on the memory of the acceleration calculation unit when receiving the data. Converting from a low-precision format type to a high-precision format type, and when transmitting data, the data format can be converted from a high-precision format type to a low-precision format type by the data format conversion device 10 in the buffer chip on the memory of the acceleration calculation unit. It can be seen that whether the data is transmitted or received, the computing resources of the accelerator itself are not occupied, and it is ensured that the data transmitted on the bus between the main memory and the device memory is always low-precision data with a small amount of data.
  • the short type data to the float type data in the GPU needs 300 us, and the float type data is transferred.
  • the short type data also needs 300us.
  • the data format conversion device 10 of this embodiment if implemented in the memory buffer, the data read and write process can be implemented in only a few clock cycles, and the clock frequency of the memory module is 1600 MHz, then the short type Data to float type data, float type data to short type data can be realized in the ns order time. It can be seen that the data format conversion apparatus 10 provided in this embodiment can greatly shorten the overall calculation time of the data.
  • the module for performing data format conversion in the buffer chip in the memory may include three parts: a data format conversion, an address conversion, and a control module, as shown in FIG. 10a, wherein the buffer further includes a bus interface, a data cache, and Control cache, address cache.
  • Data format conversion module: used for data format conversion, converting low precision to high precision when writing and high precision to low precision when reading.
  • Address conversion module: used for address space mapping, implementing the mapping between the CPU memory space and the accelerator memory space.
  • Control module: used to control the data format conversion process, mainly whether format conversion is performed and in what manner.
  • The data format conversion module completes the data format conversion according to the instruction of the control module. Its working modes can be divided into three types:
  • Converting data from low precision to high precision: increase the bit width of the data format, fill the high-order part with zeros, and keep the original data in the low-order part. Converting data from high precision to low precision: reduce the bit width of the data format and directly truncate the low-order part. No data format conversion.
  • the selection of the working mode of the module, the specific data format conversion mode, and the converted data type are all implemented according to the control word in the control module instruction.
  • the function of the address conversion module is to complete the mapping of the data address space before and after the data format conversion according to the instruction of the control module: that is, the address of each data after conversion is calculated according to the address of the data before the conversion and the data format before and after the conversion.
  • the function of the control module is to implement the control of the data format conversion module and the address conversion module.
  • the specific implementation manner may be implemented by adding a mode selection register to the buffer chip, and the register contains all the required control words.
  • A mode selection register similar to the one in FIG. 7 can be used.
  • Con is the conversion mode switch control word: Con=1 enables data format conversion and Con=0 disables it.
  • S2I, S2F, I2L, F2D, and so on each represent a format conversion type: S2I means short type to int type, S2F means short type to float type, I2L means int type to long int type, and F2D means float type to double type.
  • Because a computer has only a few data format types, the control words can be set by enumeration.
  • A control word set to 1 indicates that the corresponding format type conversion is performed; the other control words are zero.
  • The mode selection register can also be implemented in other ways, for example by encoding: 0000 means no format conversion, 0001 means short type to int type, 0010 means short type to float type, 0011 means int type to long int type, and so on.
  • Optionally, the address conversion module can be integrated into the data format conversion module, so that data format conversion is completed in the memory directly as the data format conversion module reads and writes data. This avoids occupying accelerator computing resources and improves overall computing efficiency.
  • The control module controls the data format conversion module and the address conversion module through the control word in the mode selection register. The present invention proposes two methods for transferring the control word from the application to the mode selection register.
  • The first method is to add a dedicated configuration line to the memory module and include it among the control lines of the bus; the user writes the control word through this dedicated configuration line.
  • This method requires modifying the hardware interface of the memory module.
  • The second method is to use the existing control lines in the memory, modify the control line protocol, and add a control word transmission step. This method does not require modifying the hardware interface of the memory module and is relatively easy to implement.
  • the embodiment of the present invention further provides an API interface corresponding to the data format conversion, which is used to provide a unified user interface, and the user invokes the conversion module proposed by the present invention to implement data format conversion.
  • In the technical solution of this embodiment, the buffer chip includes a bus interface, an address cache unit, a control cache unit, a data cache unit, and a data format conversion apparatus. The bus interface is configured to receive a data copy command sent by the CPU and forward it to the data format conversion apparatus.
  • According to the data copy command, the bus interface caches the copied data to be converted in the data cache unit, and caches the storage address and the format conversion type of the data to be converted in the address cache unit and the control cache unit, respectively.
  • The control module of the data format conversion apparatus is configured to send a control instruction to the conversion module when it receives the data copy command. The conversion module is configured to complete data format conversion and storage address mapping of the data to be converted according to the received control instruction,
  • and to send the data to be converted, after its data format conversion is complete, to the acceleration calculation unit.
  • FIG. 10b is a flowchart of a data format conversion method according to an embodiment of the present invention. As shown in FIG. 10b, the method provided in this embodiment may be specifically implemented by the buffer chip 20 provided in the foregoing embodiment. The method provided in this embodiment may specifically include:
  • S1001. The bus interface of the buffer chip receives a data copy command sent by the central processing unit (CPU) and, according to the data copy command, caches the copied data to be converted in the data cache unit,
  • caches the storage address of the data to be converted in the address cache unit, caches the format conversion type of the data to be converted in the control cache unit, and sends the data copy command to the control module of the data format conversion apparatus.
  • S1002. When the control module receives the data copy command, it obtains the data to be converted, the format conversion type, and the address of the data to be converted, and sends a control instruction to the conversion module of the data format conversion apparatus.
  • The control instruction instructs the conversion module to perform data format conversion and storage address mapping on the data to be converted.
  • Alternatively, S1002. The control module sends a control instruction to the conversion module of the data format conversion apparatus according to the received data copy command, where the data copy command contains information about the data to be converted, the format conversion type, and the address of the data to be converted; the control instruction instructs the conversion module to perform data format conversion and storage address mapping on the data to be converted.
  • S1003. The conversion module completes data format conversion and storage address mapping of the data to be converted according to the received control instruction, and sends the data to be converted, after its data format conversion is complete, to the acceleration calculation unit.
  • In this step, when performing data format conversion, the conversion module completes the conversion according to the data format type before conversion and the data format type after conversion given in the control instruction sent by the control module.
  • When performing storage address mapping, the conversion module obtains the storage address of the data after conversion from the control instruction, the storage address before conversion, the data format type before conversion, and the data format type after conversion.
  • Specifically, the control instruction includes a control word; the control word includes a conversion mode switch control word and a format conversion type control word. The conversion mode switch control word instructs the conversion module to enable, or to disable, data format conversion of the data to be converted; the format conversion type control word indicates the data format type before conversion and the data format type after conversion.
  • Specifically, the control instruction includes a control code; the control code instructs the conversion module to enable data format conversion of the data to be converted, or to disable it, or indicates the data format type before conversion and the data format type after conversion.
  • MemCopy is the function name, denoting the copy of the data to be converted between the main memory of the CPU and the device memory of the acceleration calculation unit; destination is the destination address of the data to be converted; source is its source address; size is its size; direction is the copy direction; cpytype is the format conversion type.
  • Optionally, completing the data format conversion and storage address mapping of the data to be converted may include: at write time, storing the data to be converted directly in the format given by the format conversion type indicated by the control instruction, at the storage address indicated by the control instruction. The data is thus held in the data conversion apparatus already in its converted format, and a read returns the converted data directly, so the data format conversion is completed as part of reading and writing the data.
  • the buffer chip is a buffer chip in the memory.
  • the above data format conversion can be realized in the process of reading and writing to the memory, without occupying additional computing resources of the accelerator, and improving the calculation efficiency.
  • The data format conversion method provided by this embodiment can be used to implement the technical solution of the foregoing apparatus embodiment; the implementation principle and technical effects are similar, and details are not repeated here.
  • In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners.
  • For example, the apparatus embodiments described above are merely illustrative.
  • For example, the division into units is only a division by logical function.
  • In an actual implementation there may be other divisions: for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • The above integrated unit can be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
  • the above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium.
  • The above software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods of the various embodiments of the present invention.
  • The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

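Embodiments of the present invention provide a data format conversion apparatus, a buffer chip, and a method. The data format conversion apparatus is disposed on a buffer chip in a memory and comprises a control module, configured to send a control instruction to a conversion module when a data copy command is obtained, and the conversion module, configured to complete data format conversion and storage address mapping of the data to be converted according to the received control instruction. This avoids the prior-art problem that placing data format conversion units in the acceleration calculation unit occupies additional computing resources and computing time of the acceleration calculation unit. It reduces the amount of data transmitted between the main memory and the device memory, occupies no additional computing resources, and improves computing efficiency while maintaining computing precision.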

Description

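Data format conversion apparatus, buffer chip, and method
Technical Field
Embodiments of the present invention relate to computer technologies, and in particular to a data format conversion apparatus, a buffer chip, and a method.
Background
In high-performance computer architectures, an acceleration calculation unit with strong processing capability for compute-intensive tasks is usually provided, such as a general-purpose graphics processor (General Purpose Graphic Process Unit, GPGPU) or a field-programmable gate array (Field-Programmable Gate Array, FPGA). When processing a compute-intensive task, the central processing unit (Central Processing Unit, CPU) allocates a large amount of parallel computing work to the acceleration calculation unit, which relieves the computing load on the CPU and improves the overall computing efficiency of the computer.
As shown in FIG. 1, in a computer structure that includes an acceleration calculation unit, the CPU and the acceleration calculation unit each have their own storage unit. The memory of the CPU is usually defined as the main memory, and the memory of the acceleration calculation unit as the device memory; data is transferred between the main memory and the device memory over a bus.
Iterative calculation is a typical compute-intensive task and, to improve computing efficiency, is usually allocated to the acceleration calculation unit. It is commonly applied to solving systems of equations, computing matrix eigenvalues, or singular value decomposition (Singular Value Decomposition, SVD). As shown in FIG. 2, the basic idea of iterative calculation is successive approximation: start from a rough initial value, then repeatedly substitute the intermediate result into the same iteration formula until the result converges to the required precision.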
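The successive-approximation loop described above can be sketched as follows. This is a minimal Python illustration and not part of the embodiment: the function name `iterate`, the tolerance, and the choice of Heron's square-root formula as the iteration formula are assumptions made for the example.

```python
def iterate(update, x0, tol=1e-12, max_iter=100):
    """Apply the same update formula repeatedly until two successive
    results differ by at most `tol` (the required precision)."""
    x = x0
    for step in range(1, max_iter + 1):
        x_next = update(x)
        if abs(x_next - x) <= tol:
            return x_next, step
        x = x_next
    raise RuntimeError("iteration did not converge")


# Rough initial value 1.0; the iteration formula x -> (x + 2/x) / 2
# approximates the square root of 2.
root, steps = iterate(lambda x: 0.5 * (x + 2.0 / x), x0=1.0)
```

Each pass feeds the intermediate result back into the same formula, which is why the precision of the intermediate values matters for convergence.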
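Because compute-intensive tasks such as iterative calculation place high demands on the data precision of intermediate results, data in a high-precision format is usually used during accelerated calculation so that the calculation converges effectively, and data in a high-precision format is also used during data transmission. Using a high-precision data format in both stages meets the precision requirement, but it increases the amount of data transmitted, which increases the transmission delay and, from the CPU's point of view, the overall calculation time.
In an existing technical solution, as shown in FIG. 3, two data format conversion units are added to the acceleration calculation unit: a high-precision-to-low-precision unit and a low-precision-to-high-precision unit. The CPU sends low-precision data to the acceleration calculation unit over the bus; after receiving the data, the acceleration calculation unit zero-pads the low-precision data into the high-precision data format and then performs the calculation. When the acceleration calculation unit needs to send data to the CPU, it first converts the high-precision data into low-precision data and then sends it to the CPU over the bus.
Although this prior-art solution reduces the amount of data transmitted, and therefore the transmission delay, by transferring low-precision data, the two data format conversion units added to the acceleration calculation unit consume additional computing resources and computing time of the acceleration calculation unit for data format conversion, which lowers the efficiency of accelerated calculation.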
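Summary
Embodiments of the present invention provide a data format conversion apparatus, a buffer chip, and a method, to reduce the amount of data transmitted between the main memory of the CPU and the device memory of the acceleration calculation unit without occupying additional computing resources of the acceleration calculation unit or the CPU, improving computing efficiency while maintaining computing precision.
According to a first aspect, an embodiment of the present invention provides a data format conversion apparatus.
The apparatus is used in a buffer chip in a memory and includes a control module and a conversion module.
The control module is configured to send a control instruction to the conversion module according to a received data copy command, where the data copy command contains information about the data to be converted, the format conversion type, and the address of the data to be converted; the control instruction instructs the conversion module to perform data format conversion and storage address mapping on the data to be converted.
The conversion module is configured to complete the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and to send the data to be converted, after its data format conversion is complete, to the acceleration calculation unit.
In a first possible implementation of the first aspect, completing the data format conversion and storage address mapping of the data to be converted includes: at write time, the conversion module stores the data to be converted in the format given by the format conversion type indicated by the control instruction, at the storage address indicated by the control instruction.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the control instruction contains a control word stored in a mode selection register; the control word includes a conversion mode switch control word and a format conversion type control word.
The conversion mode switch control word instructs the conversion module to enable, or to disable, data format conversion of the data to be converted.
The format conversion type control word indicates the data format type of the data to be converted before conversion and its data format type after conversion.
With reference to the first aspect or the first possible implementation of the first aspect, in a third possible implementation of the first aspect, the control instruction contains a control code stored in the mode selection register; the control code instructs the conversion module to enable data format conversion of the data to be converted, or to disable it, or indicates the data format type before conversion and the data format type after conversion.
With reference to the first aspect through the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the conversion module includes a data format conversion unit and an address conversion unit.
The data format conversion unit is configured to complete the data format conversion of the data to be converted according to the control instruction sent by the control module.
The address conversion unit is configured to obtain the storage address of the data after conversion from the control instruction, the storage address before conversion, the data format type before conversion, and the data format type after conversion.
According to a second aspect, an embodiment of the present invention provides a buffer chip, including a bus interface, an address cache unit, a control cache unit, a data cache unit, and the apparatus according to any implementation of the first aspect.
According to a third aspect, an embodiment of the present invention provides a data format conversion method, including:
A bus interface of a buffer chip in a memory receives a data copy command sent by a central processing unit (CPU); according to the data copy command, caches the copied data to be converted in a data cache unit, caches the storage address of the data to be converted in an address cache unit, and caches the format conversion type of the data to be converted in a control cache unit; and sends the data copy command to a control module of a data format conversion apparatus.
The control module sends, according to the received data copy command, a control instruction to a conversion module of the data format conversion apparatus, where the data copy command contains information about the data to be converted, the format conversion type, and the address of the data to be converted; the control instruction instructs the conversion module to perform data format conversion and storage address mapping on the data to be converted.
The conversion module completes the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and sends the data to be converted, after its data format conversion is complete, to the acceleration calculation unit.
In a first possible implementation of the third aspect, the control instruction contains a control word; the control word includes a conversion mode switch control word and a format conversion type control word.
The conversion mode switch control word instructs the conversion module to enable, or to disable, data format conversion of the data to be converted.
The format conversion type control word indicates the data format type of the data to be converted before conversion and its data format type after conversion.
In a second possible implementation of the third aspect, the control instruction contains a control code.
The control code instructs the conversion module to enable data format conversion of the data to be converted, or to disable it, or indicates the data format type before conversion and the data format type after conversion.
With reference to the third aspect through the second possible implementation of the third aspect, in a third possible implementation of the third aspect, the conversion module completing the data format conversion and storage address mapping of the data to be converted according to the received control instruction includes:
The conversion module completes the data format conversion of the data to be converted according to the data format type before conversion and the data format type after conversion given in the control instruction sent by the control module.
The conversion module obtains the storage address of the data after conversion from the control instruction, the storage address before conversion, the data format type before conversion, and the data format type after conversion.
With reference to the third aspect through the third possible implementation of the third aspect, in a fourth possible implementation of the third aspect, the data copy command is defined as:
MemCopy(destination,source,size,direction,cpytype)
MemCopy is the function name, denoting the copy of the data to be converted between the main memory of the CPU and the device memory of the acceleration calculation unit; destination is the destination address of the data to be converted; source is its source address; size is its size; direction is the copy direction; cpytype is the format conversion type.
With reference to the third aspect through the fourth possible implementation of the third aspect, in a fifth possible implementation of the third aspect, completing the data format conversion and storage address mapping of the data to be converted includes: at write time, the conversion module stores the data to be converted in the format given by the format conversion type indicated by the control instruction, at the storage address indicated by the control instruction.
Embodiments of the present invention provide a data format conversion apparatus, a buffer chip, and a method. The apparatus is used in a buffer chip in a memory. Its control module is configured to send a control instruction to the conversion module when a data copy command is received; the conversion module is configured to complete the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and to send the converted data to the acceleration calculation unit. Placing the data format conversion apparatus on the buffer chip avoids the prior-art problem that two data format conversion units in the acceleration calculation unit occupy additional computing resources and computing time of the acceleration calculation unit. The amount of data transmitted between the main memory of the CPU and the device memory of the acceleration calculation unit is reduced without occupying additional computing resources of the acceleration calculation unit or the CPU, which improves computing efficiency while maintaining computing precision.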
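Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of a prior-art computer having an acceleration calculation unit;
FIG. 2 is a block diagram of the principle of iterative calculation;
FIG. 3 is a schematic diagram of an accelerated calculation flow in the prior art;
FIG. 4 is a schematic diagram of a computer structure that includes the data format conversion apparatus provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of an accelerated calculation flow in an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a buffer chip provided by an embodiment of the present invention;
FIG. 7 is a schematic diagram of the mode selection register in the data format conversion apparatus on the buffer chip provided by an embodiment of the present invention;
FIG. 8 is a first schematic diagram of storage address mapping during data format conversion in an embodiment of the present invention;
FIG. 9 is a second schematic diagram of storage address mapping during data format conversion in an embodiment of the present invention;
FIG. 10a is a schematic structural diagram of another buffer chip provided by an embodiment of the present invention;
FIG. 10b is a flowchart of a data format conversion method provided by an embodiment of the present invention.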
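Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
As described in the Background, existing accelerated calculation systems that include an acceleration calculation unit face a conflict between data transmission efficiency and calculation precision, in particular between the efficiency of data transmission between the CPU and the acceleration calculation unit and the calculation precision. The embodiments of the present invention propose a solution to this conflict: the data format is converted directly during data transmission, without occupying additional computing resources, in particular those of the accelerator, so that the overall computing efficiency of the accelerated calculation system improves while calculation precision is maintained.
The solution provided by the embodiments of the present invention can replace the data format conversion solution mentioned in the Background and resolve the conflict between data transmission efficiency and calculation precision. It can be applied to any scenario that requires data format conversion, for example the baseband processing part of a base station platform with an accelerator, the baseband processing part of a user terminal, or the data exchange part of a parallel or distributed computing system.
The memory of an acceleration calculation unit is usually synchronous dynamic random-access memory (Synchronous Dynamic Random-Access Memory, SDRAM). An SDRAM memory module consists of several memory chips arranged on a printed circuit board (Printed Circuit Board, PCB). Because SDRAM operates at a high frequency and signal attenuation during transmission is large, a buffer chip is usually added to the PCB to buffer and reshape the signals on the data bus, address bus, and control bus. The embodiments of the present invention improve this buffer chip so that the data format is converted directly during data transmission. In other words, to preserve data transmission efficiency, low-precision data should be transmitted; to preserve overall computing efficiency, data format conversion should not occupy the computing resources of the accelerator. The present invention therefore places the data format conversion in the device memory, so that the memory completes the data format conversion while reading and writing data.
The computer structure that includes the data format conversion apparatus provided by an embodiment of the present invention is described with reference to FIG. 4. In this embodiment, the data format conversion apparatus is disposed on the buffer chip of the device memory of the acceleration calculation unit, and data format conversion is performed on the device memory during data reads and writes. The accelerated calculation flow in this case is shown in FIG. 5: while receiving data, the data format can be converted from a low-precision format type to a high-precision format type, and while sending data, from a high-precision format type to a low-precision format type. Neither receiving nor sending data occupies the computing resources of the acceleration calculation unit itself, and the data carried on the bus between the device memory and the main memory is always low-precision data with a small data volume.
The technical solutions provided by the embodiments of the present invention are described below through specific embodiments.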
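FIG. 6 is a schematic structural diagram of a buffer chip provided by an embodiment of the present invention. Referring to FIG. 6, the buffer chip 20 provided by this embodiment may specifically include a bus interface 204, an address cache unit 203, a control cache unit 202, a data cache unit 201, and a data format conversion apparatus 10.
The bus interface 204 is configured to receive a data copy command sent by the CPU; according to the data copy command, cache the copied data to be converted in the data cache unit 201, cache the storage address of the data to be converted in the address cache unit 203, and cache the format conversion type of the data to be converted in the control cache unit 202; and send the data copy command to the data format conversion apparatus 10.
The data format conversion apparatus 10 includes a control module 101 and a conversion module 102.
The control module 101 is configured to, on receiving the data copy command, obtain the data to be converted, the format conversion type, and the address of the data to be converted, and send a control instruction to the conversion module 102; the control instruction instructs the conversion module 102 to perform data format conversion and storage address mapping on the data to be converted.
The conversion module 102 is configured to complete the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and to send the data to be converted, after its data format conversion is complete, to the acceleration calculation unit.
Optionally, at write time the conversion module 102 stores the data to be converted directly in the format given by the format conversion type indicated by the control instruction, at the storage address indicated by the control instruction. The data is thus held in the data conversion apparatus already in its converted format, and a read returns the converted data directly, so the data format conversion is completed as part of reading and writing the data.
In practice, the control module 101 may specifically be a mode selection register disposed in the buffer chip 20, in which all control words or control codes needed for control are stored in advance.
The buffer chip 20 is a buffer chip in the memory. The data format conversion is carried out during reads and writes to the memory, without occupying additional computing resources of the accelerator, which improves computing efficiency.
In one feasible implementation, the control instruction contains a control word stored in the mode selection register. The control word includes a conversion mode switch control word and a format conversion type control word. The conversion mode switch control word instructs the conversion module 102 to enable, or to disable, data format conversion of the data to be converted; the format conversion type control word indicates the data format type before conversion and the data format type after conversion.
Referring to FIG. 7, in this embodiment the conversion mode switch control word can be set as Con, whose value indicates whether data format conversion of the data to be converted is enabled or disabled: for example, Con=1 enables it and Con=0 disables it. Because a computer has only a few data format types, all of them can be enumerated and a format conversion type control word set for each. For example, the format conversion type control words can be S2I, S2F, I2L, F2D, and so on, where S2I means short type to int type, S2F means short type to float type, I2L means int type to long int type, and F2D means float type to double type. Setting one format conversion type control word to 1 indicates that the corresponding format type conversion is performed, and the other format conversion type control words are set to 0. For example, when S2I is set to 1, the data format of the data to be converted is converted from short type to int type, and S2F, I2L, and F2D are all set to 0.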
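As an illustration of the enumerated control words, the following Python sketch models the mode selection register as a set of one-bit flags. The bit positions and the names `ModeSelect` and `selected_conversion` are hypothetical: the description names the fields (Con, S2I, S2F, I2L, F2D) but does not fix their layout.

```python
from enum import IntFlag


class ModeSelect(IntFlag):
    """Hypothetical bit layout of the mode selection register."""
    CON = 1 << 0  # conversion mode switch: 1 = conversion on, 0 = off
    S2I = 1 << 1  # short -> int
    S2F = 1 << 2  # short -> float
    I2L = 1 << 3  # int -> long int
    F2D = 1 << 4  # float -> double


TYPE_FLAGS = (ModeSelect.S2I, ModeSelect.S2F, ModeSelect.I2L, ModeSelect.F2D)


def selected_conversion(register):
    """Return the name of the selected conversion, or None when Con is 0."""
    register = ModeSelect(register)
    if not register & ModeSelect.CON:
        return None
    active = [flag for flag in TYPE_FLAGS if register & flag]
    if len(active) != 1:
        # Exactly one type control word is 1; all others must be 0.
        raise ValueError("exactly one format conversion type must be set")
    return active[0].name
```

For example, `selected_conversion(ModeSelect.CON | ModeSelect.S2I)` selects the short-to-int conversion, while any register value with Con cleared selects no conversion.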
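In another feasible implementation, the control instruction contains control codes stored in advance in the mode selection register. A control code instructs the conversion module 102 to enable data format conversion of the data to be converted, or to disable it, or indicates the data format type before conversion and the data format type after conversion.
For example, in actual use, control code 0000 can indicate that data format conversion of the data to be converted is disabled, control code 1111 that it is enabled, control code 0001 short type to int type, control code 0010 short type to float type, and control code 0011 int type to long int type. This embodiment places no limitation on this.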
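The encoded variant can be sketched as a lookup table. The table contents follow the example codes given above; the names `CONTROL_CODES` and `decode_control_code`, and the choice to reject unassigned codes, are assumptions made for this Python sketch.

```python
# 4-bit control codes from the example in the description.
CONTROL_CODES = {
    0b0000: "conversion off",
    0b1111: "conversion on",
    0b0001: "short -> int",
    0b0010: "short -> float",
    0b0011: "int -> long int",
}


def decode_control_code(code):
    """Map a 4-bit control code to its meaning; unassigned codes are rejected."""
    try:
        return CONTROL_CODES[code]
    except KeyError:
        raise ValueError(f"unassigned control code: {code:04b}") from None
```

Compared with one flag per conversion type, an encoding like this fits more conversion types into the same number of register bits.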
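In this embodiment, the control instruction can be passed from the application to the mode selection register in two ways. The first is to add a dedicated configuration line to the memory module and include it among the control lines of the bus; the user writes the control instruction through this configuration line. This method requires modifying the hardware interface of the memory module. The second is to use or multiplex the existing control lines in the memory and modify the control line protocol, adding a control instruction transmission step to the existing protocol.
Further, to make programming convenient for the user, this embodiment provides a unified data copy command. The data copy command is in effect an interface function that encapsulates the functions of the data format conversion apparatus 10; in actual use, the user only needs to call this interface function to perform the data conversion. In this embodiment, the interface function, that is, the data copy command, is defined as:
MemCopy(destination,source,size,direction,cpytype)
MemCopy is the function name, denoting the copy of the data to be converted between the main memory and the device memory; destination is the destination address of the data to be converted; source is its source address; size is its size; direction is the copy direction: from the main memory to the device memory, from the device memory to the main memory, or from one device memory to another device memory; cpytype is the format conversion type, such as S2I, S2F, or I2L, corresponding to the control instruction.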
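The following Python sketch models the call in software so that the parameter roles are concrete. It is a stand-in under stated assumptions, not the hardware mechanism: the name `mem_copy`, the little-endian byte layout, the `CPYTYPES` table, and the use of numeric conversion through the standard `struct` module (rather than the bit-level padding and truncation performed in the buffer chip) are all choices made for the example. The names Host2Device, Device2Host, short2double, and double2short come from the worked example in this description; Device2Device is a hypothetical name for the third copy direction.

```python
import struct

# cpytype -> (source element code, destination element code) for `struct`.
# With the "<" prefix the sizes are fixed: h = 2 bytes, f = 4, d = 8.
CPYTYPES = {
    "short2double": ("h", "d"),
    "double2short": ("d", "h"),
    "short2float": ("h", "f"),
    "float2short": ("f", "h"),
}
DIRECTIONS = ("Host2Device", "Device2Host", "Device2Device")


def mem_copy(destination, source, size, direction, cpytype):
    """Copy `size` bytes from `source` into the bytearray `destination`,
    converting every element on the way. Returns the bytes written."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")
    src_code, dst_code = CPYTYPES[cpytype]
    count = size // struct.calcsize("<" + src_code)
    values = struct.unpack_from(f"<{count}{src_code}", source)
    if dst_code == "h":
        values = [int(v) for v in values]  # integers are required for "h"
    converted = struct.pack(f"<{count}{dst_code}", *values)
    destination[:len(converted)] = converted
    return len(converted)


host = struct.pack("<2h", 3, -7)   # two short values in "main memory"
device = bytearray(16)             # room for two doubles in "device memory"
written = mem_copy(device, host, len(host), "Host2Device", "short2double")
```

Here 4 bytes cross the modeled bus and 16 bytes land in the device buffer, which mirrors the point that only low-precision data is transferred.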
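Note that in this embodiment the conversion module 102 can, according to the received control instruction, convert the data to be converted from a low-precision format type to a high-precision format type, or from a high-precision format type to a low-precision format type. Because the storage space required changes when the data format type changes, the conversion module 102 must also complete the storage address mapping of the data to be converted according to the control instruction while converting the data format.
Specifically, the conversion module 102 may include a data format conversion unit and an address conversion unit. The data format conversion unit is configured to complete the data format conversion of the data to be converted according to the control instruction sent by the control module 101. The address conversion unit is configured to obtain the storage address of the data after conversion from the control instruction, the storage address before conversion, the data format type before conversion, and the data format type after conversion.
In the actual conversion, if the data to be converted goes from a low-precision format type to a high-precision format type, the data format conversion unit increases the bit width of the data format accordingly, fills the high-order part with zeros, and keeps the original data in the low-order part. If the data goes from a high-precision format type to a low-precision format type, the data format conversion unit reduces the bit width accordingly and directly truncates the low-order part.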
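The widening case can be sketched at the byte level as follows. This is a minimal Python illustration that assumes big-endian byte order so the high-order part appears first; the name `widen_zero_extend` is hypothetical, and only the low-to-high direction is modeled.

```python
def widen_zero_extend(raw, dst_bytes):
    """Widen a big-endian value to `dst_bytes` bytes: the high-order part
    is filled with zeros and the low-order part keeps the original data."""
    if dst_bytes < len(raw):
        raise ValueError("destination must not be narrower than the source")
    return bytes(dst_bytes - len(raw)) + raw


short_bits = (0x1234).to_bytes(2, "big")    # 16-bit pattern 0x1234
int_bits = widen_zero_extend(short_bits, 4)  # 32-bit pattern 0x00001234
```

Zero-padding keeps the numeric value only for non-negative integers; the sketch does not model signed or floating-point types.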
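When the data format type changes, the storage space changes accordingly: if the data to be converted goes from a low-precision format type to a high-precision format type, the required storage space increases; if it goes from a high-precision format type to a low-precision format type, the required storage space decreases. The storage address corresponding to each data item therefore changes. After receiving the control instruction, the address conversion unit can calculate the storage address of each data item after conversion from its storage address before conversion and the data format types before and after conversion.
For example, assume the acceleration calculation unit is a GPGPU and that during the calculation the CPU allocates the SVD decomposition process to the GPGPU. The SVD calculation requires two data exchanges in total: first, when the calculation starts, the initial value is transferred from the main memory to the device memory; second, after the calculation ends, the result is transferred from the device memory to the main memory. Assume the main memory sends and receives short-type data while the GPGPU calculates with double-type data. That is, the short-type initial value must be converted to double-type data when the calculation starts, and the double-type result must be converted to short type when the calculation ends. For the user, these two data transfers only require calling the interface function provided by this embodiment twice.
At the start of the calculation, the interface function is called as MemCopy(destination,source,size,Host2Device,short2double). Assume the short-type data in the main memory are stored at addresses 0 and 1; after they are converted to double-type data, the storage addresses of the double-type data in the device memory are mapped to 0 and 4, as shown in FIG. 8. After the calculation ends, the interface function is called as MemCopy(destination,source,size,Device2Host,double2short). The double-type data in the device memory are stored at addresses 0 and 4; after they are converted to short-type data, the storage addresses of the short-type data in the main memory are mapped to 0 and 1, as shown in FIG. 9.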
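The address mapping in this example can be reproduced with a short calculation. The Python sketch below assumes that one address unit holds 2 bytes (one short), that both address spaces start at 0, and that short and double occupy 2 and 8 bytes; these assumptions are chosen because they reproduce the 0, 1 to 0, 4 mapping of FIG. 8 and FIG. 9, and the names `WIDTH_BYTES` and `map_address` are hypothetical.

```python
WIDTH_BYTES = {"short": 2, "int": 4, "float": 4, "double": 8}


def map_address(old_address, src_type, dst_type, unit_bytes=2):
    """Compute the storage address of an element after format conversion
    from its address before conversion and the two format types.
    Addresses are counted in units of `unit_bytes` (assumed: 2)."""
    src_width = WIDTH_BYTES[src_type]
    dst_width = WIDTH_BYTES[dst_type]
    # Which element of the array lives at the old address?
    index, rest = divmod(old_address * unit_bytes, src_width)
    if rest:
        raise ValueError("address is not aligned to an element boundary")
    # Where does that element start once every element has the new width?
    new_address, rest = divmod(index * dst_width, unit_bytes)
    if rest:
        raise ValueError("converted element is not aligned to an address unit")
    return new_address


to_device = [map_address(a, "short", "double") for a in (0, 1)]
to_host = [map_address(a, "double", "short") for a in (0, 4)]
```

The element index is what stays fixed across the conversion; the address is simply that index scaled by the element width on each side.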
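When an acceleration calculation unit that uses the data format conversion apparatus 10 of this embodiment performs accelerated calculation, the conversion from a low-precision format type to a high-precision format type on receiving data, and from a high-precision format type to a low-precision format type on sending data, is performed by the data format conversion apparatus 10 in the buffer chip on the memory of the acceleration calculation unit. Whether data is sent or received, the computing resources of the accelerator itself are not occupied, and the data carried on the bus between the main memory and the device memory is always low-precision data with a small data volume.
As a timing comparison, take the calculation and conversion of a data block of size 64*14*2048: in the prior art, converting short-type data to float-type data in the GPU takes 300 us, and converting float-type data to short-type data also takes 300 us. If the data format conversion apparatus 10 of this embodiment is implemented in the memory buffer, the conversion takes only a few clock cycles during data reads and writes. Assuming the clock frequency of the memory module is 1600 MHz, both short-to-float and float-to-short conversion can be completed in a time on the order of nanoseconds. The data format conversion apparatus 10 provided in this embodiment can therefore greatly shorten the overall calculation time.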
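The figures in this comparison can be checked with simple arithmetic. In the Python sketch below, the element widths (2 bytes for short, 4 bytes for float) and the value 4 for "a few clock cycles" are assumptions made for illustration; the block size, the 1600 MHz clock, and the 300 us figure come from the text.

```python
def cycle_time_ns(clock_mhz):
    """Duration of one clock cycle in nanoseconds."""
    return 1_000.0 / clock_mhz


ELEMENTS = 64 * 14 * 2048        # elements in the example data block
SHORT_BYTES = ELEMENTS * 2       # volume when transferred as short
FLOAT_BYTES = ELEMENTS * 4       # volume when transferred as float

ASSUMED_CYCLES = 4               # hypothetical value for "a few cycles"
conversion_ns = ASSUMED_CYCLES * cycle_time_ns(1600)
PRIOR_ART_NS = 300 * 1_000       # 300 us per direction in the GPU
```

One cycle at 1600 MHz lasts 0.625 ns, so a handful of cycles stays in the nanosecond range, and sending the block as short rather than float halves the bytes on the bus.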
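Optionally, the modules that perform data format conversion in the buffer chip in the memory may include three parts: a data format conversion module, an address conversion module, and a control module, as shown in FIG. 10a. The buffer also includes a bus interface, a data cache, a control cache, and an address cache. The three newly added modules, shown in gray in FIG. 10a, are introduced below.
Data format conversion module: used for data format conversion, converting low precision to high precision when writing and high precision to low precision when reading.
Address conversion module: used for address space mapping, implementing the mapping between the CPU memory space and the accelerator memory space.
Control module: used to control the data format conversion process, mainly whether format conversion is performed and in what manner.
The three modules are described further below:
1) Data format conversion module
The data format conversion module completes the data format conversion according to the instruction of the control module. Its working modes can be divided into three types:
Converting data from low precision to high precision: increase the bit width of the data format, fill the high-order part with zeros, and keep the original data in the low-order part;
Converting data from high precision to low precision: reduce the bit width of the data format and directly truncate the low-order part;
No data format conversion;
The choice of working mode, the specific data format conversion manner, and the data type after conversion are all determined by the control word in the control module instruction.
2) Address conversion module
A change in data format changes the storage space the data occupies: going from low precision to high precision increases the required storage space, and going from high precision to low precision decreases it. The storage address corresponding to each data item therefore changes. The function of the address conversion module is to complete, according to the instruction of the control module, the mapping of the data address space before and after data format conversion: that is, to calculate the address of each data item after conversion from its address before conversion and the data formats before and after conversion.
3) Control module
The function of the control module is to control the data format conversion module and the address conversion module. It can be implemented by adding a mode selection register to the buffer chip; the register contains all the required control words, and a mode selection register similar to the one in FIG. 7 can be used. Con is the conversion mode switch control word: Con=1 enables data format conversion and Con=0 disables it. S2I, S2F, I2L, F2D, and so on each represent a format conversion type: S2I means short type to int type, S2F means short type to float type, I2L means int type to long int type, and F2D means float type to double type. Because a computer has only a few data format types, the control words can be set by enumeration: a control word set to 1 indicates that the corresponding format type conversion is performed, and the other control words are zero.
The mode selection register can also be implemented in other ways, for example by encoding: 0000 means no format conversion, 0001 means short type to int type, 0010 means short type to float type, 0011 means int type to long int type, and so on.
Optionally, the address conversion module can be integrated into the data format conversion module, so that data format conversion is completed in the memory directly as the data format conversion module reads and writes data. This avoids occupying accelerator computing resources and improves overall computing efficiency.
Among the three newly added modules, the control module controls the data format conversion module and the address conversion module through the control word in the mode selection register. The present invention proposes two methods for transferring the control word from the application to the mode selection register:
The first method is to add a dedicated configuration line to the memory module and include it among the control lines of the bus; the user writes the control word through this dedicated configuration line. This method requires modifying the hardware interface of the memory module.
The second method is to use the existing control lines in the memory, modify the control line protocol, and add a control word transmission step. This method does not require modifying the hardware interface of the memory module and is relatively easy to implement.
Further, the embodiment of the present invention also provides an API corresponding to the data format conversion. It provides a unified user interface through which the user invokes the conversion module proposed by the present invention to perform data format conversion.
In the technical solution of this embodiment, the buffer chip includes a bus interface, an address cache unit, a control cache unit, a data cache unit, and a data format conversion apparatus. The bus interface is configured to receive a data copy command sent by the CPU and forward it to the data format conversion apparatus and, according to the data copy command, to cache the copied data to be converted in the data cache unit and cache the storage address and the format conversion type of the data to be converted in the address cache unit and the control cache unit, respectively. The control module of the data format conversion apparatus is configured to send a control instruction to the conversion module when it receives the data copy command; the conversion module is configured to complete the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and to send the converted data to the acceleration calculation unit. Placing the data format conversion apparatus on the buffer chip avoids the prior-art problem that two data format conversion units in the acceleration calculation unit occupy additional computing resources and computing time of the acceleration calculation unit. The amount of data transmitted between the main memory of the CPU and the device memory of the acceleration calculation unit is reduced without occupying additional computing resources of the acceleration calculation unit or the CPU, which improves computing efficiency while maintaining computing precision.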
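FIG. 10b is a flowchart of a data format conversion method provided by an embodiment of the present invention. As shown in FIG. 10b, the method provided by this embodiment may be performed by the buffer chip 20 provided in the foregoing embodiment, and may specifically include:
S1001. The bus interface of the buffer chip receives a data copy command sent by the central processing unit (CPU); according to the data copy command, caches the copied data to be converted in the data cache unit, caches the storage address of the data to be converted in the address cache unit, and caches the format conversion type of the data to be converted in the control cache unit; and sends the data copy command to the control module of the data format conversion apparatus.
S1002. When the control module receives the data copy command, it obtains the data to be converted, the format conversion type, and the address of the data to be converted, and sends a control instruction to the conversion module of the data format conversion apparatus; the control instruction instructs the conversion module to perform data format conversion and storage address mapping on the data to be converted.
Alternatively, S1002. The control module sends a control instruction to the conversion module of the data format conversion apparatus according to the received data copy command, where the data copy command contains information about the data to be converted, the format conversion type, and the address of the data to be converted; the control instruction instructs the conversion module to perform data format conversion and storage address mapping on the data to be converted.
S1003. The conversion module completes the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and sends the data to be converted, after its data format conversion is complete, to the acceleration calculation unit.
In this step, when performing data format conversion, the conversion module completes the conversion according to the data format type before conversion and the data format type after conversion given in the control instruction sent by the control module. When performing storage address mapping, the conversion module obtains the storage address of the data after conversion from the control instruction, the storage address before conversion, the data format type before conversion, and the data format type after conversion.
Specifically, the control instruction contains a control word; the control word includes a conversion mode switch control word and a format conversion type control word. The conversion mode switch control word instructs the conversion module to enable, or to disable, data format conversion of the data to be converted; the format conversion type control word indicates the data format type before conversion and the data format type after conversion.
Specifically, the control instruction contains a control code; the control code instructs the conversion module to enable data format conversion of the data to be converted, or to disable it, or indicates the data format type before conversion and the data format type after conversion.
The data copy command in this embodiment is defined as:
MemCopy(destination,source,size,direction,cpytype)
MemCopy is the function name, denoting the copy of the data to be converted between the main memory of the CPU and the device memory of the acceleration calculation unit; destination is the destination address of the data to be converted; source is its source address; size is its size; direction is the copy direction; cpytype is the format conversion type.
Optionally, completing the data format conversion and storage address mapping of the data to be converted may include: at write time, storing the data to be converted directly in the format given by the format conversion type indicated by the control instruction, at the storage address indicated by the control instruction. The data is thus held in the data conversion apparatus already in its converted format, and a read returns the converted data directly, so the data format conversion is completed as part of reading and writing the data.
The buffer chip is a buffer chip in the memory. The data format conversion is carried out during reads and writes to the memory, without occupying additional computing resources of the accelerator, which improves computing efficiency.
The data format conversion method provided by this embodiment can be used to implement the technical solution of the foregoing apparatus embodiment; the implementation principle and technical effects are similar, and details are not repeated here.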
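In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and in an actual implementation there may be other divisions; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods of the various embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
A person skilled in the art can clearly understand that, for convenience and brevity of description, only the above division into functional modules is used as an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or some of the functions described above. For the specific working process of the apparatus described above, refer to the corresponding process in the foregoing method embodiments; details are not repeated here.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.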

Claims (12)

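  1. A data format conversion apparatus, wherein
    the apparatus is used in a buffer chip in a memory, and the apparatus comprises a control module and a conversion module;
    the control module is configured to send a control instruction to the conversion module according to a received data copy command, wherein the data copy command contains information about data to be converted, a format conversion type, and an address of the data to be converted; and the control instruction instructs the conversion module to perform data format conversion and storage address mapping on the data to be converted; and
    the conversion module is configured to complete the data format conversion and storage address mapping of the data to be converted according to the received control instruction, and to send the data to be converted, after its data format conversion is complete, to an acceleration calculation unit.
  2. The apparatus according to claim 1, wherein completing the data format conversion and storage address mapping of the data to be converted comprises: at write time, the conversion module stores the data to be converted in the format given by the format conversion type indicated by the control instruction, at the storage address indicated by the control instruction.
  3. The apparatus according to claim 1 or 2, wherein the control instruction contains a control word stored in a mode selection register; the control word comprises a conversion mode switch control word and a format conversion type control word;
    the conversion mode switch control word instructs the conversion module to enable, or to disable, data format conversion of the data to be converted; and
    the format conversion type control word indicates the data format type of the data to be converted before conversion and its data format type after conversion.
  4. The apparatus according to claim 1 or 2, wherein the control instruction contains a control code stored in a mode selection register; the control code instructs the conversion module to enable data format conversion of the data to be converted, or to disable it, or indicates the data format type before conversion and the data format type after conversion.
  5. The apparatus according to any one of claims 1 to 4, wherein the conversion module comprises a data format conversion unit and an address conversion unit;
    the data format conversion unit is configured to complete the data format conversion of the data to be converted according to the control instruction sent by the control module; and
    the address conversion unit is configured to obtain the storage address of the data after conversion from the control instruction, the storage address before conversion, the data format type before conversion, and the data format type after conversion.
  6. A buffer chip, comprising: a bus interface, an address cache unit, a control cache unit, a data cache unit, and the apparatus according to any one of claims 1 to 5.
  7. A data format conversion method, comprising:
    receiving, by a bus interface of a buffer chip in a memory, a data copy command sent by a central processing unit (CPU); according to the data copy command, caching the copied data to be converted in a data cache unit, caching the storage address of the data to be converted in an address cache unit, and caching the format conversion type of the data to be converted in a control cache unit; and sending the data copy command to a control module of a data format conversion apparatus;
    sending, by the control module according to the received data copy command, a control instruction to a conversion module of the data format conversion apparatus, wherein the data copy command contains information about the data to be converted, the format conversion type, and the address of the data to be converted; and the control instruction instructs the conversion module to perform data format conversion and storage address mapping on the data to be converted; and
    completing, by the conversion module according to the received control instruction, the data format conversion and storage address mapping of the data to be converted, and sending the data to be converted, after its data format conversion is complete, to an acceleration calculation unit.
  8. The method according to claim 7, wherein the control instruction contains a control word; the control word comprises a conversion mode switch control word and a format conversion type control word;
    the conversion mode switch control word instructs the conversion module to enable, or to disable, data format conversion of the data to be converted; and
    the format conversion type control word indicates the data format type of the data to be converted before conversion and its data format type after conversion.
  9. The method according to claim 7, wherein the control instruction contains a control code;
    the control code instructs the conversion module to enable data format conversion of the data to be converted, or to disable it, or indicates the data format type before conversion and the data format type after conversion.
  10. The method according to any one of claims 7 to 9, wherein completing, by the conversion module according to the received control instruction, the data format conversion and storage address mapping of the data to be converted comprises:
    completing, by the conversion module, the data format conversion of the data to be converted according to the data format type before conversion and the data format type after conversion given in the control instruction sent by the control module; and
    obtaining, by the conversion module, the storage address of the data after conversion from the control instruction, the storage address before conversion, the data format type before conversion, and the data format type after conversion.
  11. The method according to any one of claims 7 to 10, wherein the data copy command is defined as:
    MemCopy(destination,source,size,direction,cpytype)
    wherein MemCopy is the function name, denoting the copy of the data to be converted between the main memory of the CPU and the device memory of the acceleration calculation unit; destination is the destination address of the data to be converted; source is its source address; size is its size; direction is the copy direction; and cpytype is the format conversion type.
  12. The method according to any one of claims 7 to 11, wherein completing the data format conversion and storage address mapping of the data to be converted comprises: at write time, the conversion module stores the data to be converted in the format given by the format conversion type indicated by the control instruction, at the storage address indicated by the control instruction.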
PCT/CN2015/077311 2015-04-23 2015-04-23 Data format conversion apparatus, buffer chip, and method WO2016169032A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201580076244.0A CN107209663B (zh) 2015-04-23 2015-04-23 Data format conversion apparatus, buffer chip, and method
PCT/CN2015/077311 WO2016169032A1 (zh) 2015-04-23 2015-04-23 Data format conversion apparatus, buffer chip, and method
US15/789,011 US10402119B2 (en) 2015-04-23 2017-10-20 Data format conversion apparatus and method and buffer chip

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/077311 WO2016169032A1 (zh) 2015-04-23 2015-04-23 Data format conversion apparatus, buffer chip, and method

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/789,011 Continuation US10402119B2 (en) 2015-04-23 2017-10-20 Data format conversion apparatus and method and buffer chip

Publications (1)

Publication Number Publication Date
WO2016169032A1 (zh) 2016-10-27

Family

ID=57143743

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/077311 WO2016169032A1 (zh) 2015-04-23 2015-04-23 Data format conversion apparatus, buffer chip, and method

Country Status (3)

Country Link
US (1) US10402119B2 (zh)
CN (1) CN107209663B (zh)
WO (1) WO2016169032A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112434781A (zh) * 2019-08-26 2021-03-02 上海寒武纪信息科技有限公司 Method, apparatus, and related product for processing data
US11934668B2 (en) * 2020-09-02 2024-03-19 Samsung Electronics Co., Ltd. Electronic device with storage device data conversion

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10997492B2 (en) * 2017-01-20 2021-05-04 Nvidia Corporation Automated methods for conversions to a lower precision data format
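CN112394993A (zh) * 2019-08-13 2021-02-23 上海寒武纪信息科技有限公司 Half-precision floating point to short integer instruction processing apparatus, method, and related product
CN113766270B (zh) * 2021-02-26 2024-06-18 北京沃东天骏信息技术有限公司 Video playback method, system, server, terminal device, and electronic device
CN112948129A (zh) * 2021-03-30 2021-06-11 深圳致星科技有限公司 Federated-learning-based data transmission optimization method, device, and readable storage medium
CN114327256A (zh) * 2021-11-22 2022-04-12 南京风兴科技有限公司 Online data format conversion architecture and method for a neural network processor
CN118368346B (zh) * 2024-06-14 2024-08-30 深圳三铭电气有限公司 Bus multi-protocol conversion control method, apparatus, device, and storage medium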

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
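CN102089741A (zh) * 2007-09-17 2011-06-08 国际商业机器公司 Executing computer-intensive database user-defined programs on an attached high-performance parallel computer
CN103257847A (zh) * 2007-12-26 2013-08-21 英特尔公司 Methods, apparatus, and instructions for converting vector data
CN104364755A (zh) * 2012-05-19 2015-02-18 维努·坎达戴 Method and apparatus for accelerating computations by parallel computation of middle stratum operations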

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000028518A2 (en) * 1998-11-09 2000-05-18 Broadcom Corporation Graphics display system
JP3138693B2 (ja) * 1999-01-05 2001-02-26 茨城日本電気株式会社 Data compression circuit
CN1286321C (zh) * 2004-06-11 2006-11-22 上海大学 MPEG-4 image transmission method and system based on the MCF5272 platform
US20080175137A1 (en) * 2007-01-23 2008-07-24 Mediatek Inc. Method for encoding data written to optical storage media
US8898324B2 (en) * 2010-06-24 2014-11-25 International Business Machines Corporation Data access management in a hybrid memory server
US10069896B2 (en) * 2015-11-01 2018-09-04 International Business Machines Corporation Data transfer via a data storage drive
US20180150256A1 (en) * 2016-11-29 2018-05-31 Intel Corporation Technologies for data deduplication in disaggregated architectures
CN107317584B (zh) * 2017-06-28 2020-11-06 上海兆芯集成电路有限公司 Accelerated compression method and accelerated compression apparatus


Also Published As

Publication number Publication date
CN107209663A (zh) 2017-09-26
US20180039446A1 (en) 2018-02-08
CN107209663B (zh) 2020-03-10
US10402119B2 (en) 2019-09-03

Similar Documents

Publication Publication Date Title
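WO2016169032A1 (zh) Data format conversion apparatus, buffer chip, and method
JP6796304B2 (ja) Final level cache system and corresponding method
CN110647480B (zh) Data processing method, remote direct memory access network interface card, and device
US11010056B2 (en) Data operating method, device, and system
CN112463714A (zh) Remote direct memory access method, heterogeneous computing system, and electronic device
KR102617360B1 (ko) Method and apparatus for accessing non-volatile memory as byte-addressable memory
TW201714090A (zh) Memory device, memory addressing method, and article comprising a tangible storage medium
WO2018041074A1 (zh) Method, apparatus, and system for accessing a memory device
CN113032293A (zh) Cache manager and control component
CN105205025A (zh) Chip interconnection method, chip, and apparatus
CN117312201B (zh) Data transmission method and apparatus, accelerator device, host, and storage medium
CN107250995B (zh) Memory management device
CN115374046A (zh) Multiprocessor data interaction method, apparatus, device, and storage medium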
US10489322B2 (en) Apparatus and method to improve performance in DMA transfer of data
US8521968B2 (en) Memory controller and methods
US20240193084A1 (en) Storage System and Method for Accessing Same
US10198219B2 (en) Method and apparatus for en route translation in solid state graphics systems
US11409539B2 (en) On-demand programmable atomic kernel loading
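CN101488119B (zh) Address decoding method, apparatus, and board
CN113961487A (zh) Electronic device and method for accelerating memory access
CN113806431A (zh) Method for transmitting simulation data, electronic system, and storage medium
US10261700B1 (en) Method and apparatus for streaming buffering to accelerate reads
WO2023142114A1 (zh) Data processing method, apparatus, and electronic device
CN114840458B (zh) Read/write module, system on chip, and electronic device
CN117056263A (zh) SPI controller, control method, system-on-chip, and Bluetooth device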

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 15889521; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 15889521; Country of ref document: EP; Kind code of ref document: A1)